Preface


This  study  culminated  in  the  production  of  several 
models  that  may  be  of  use  to  Air  Force  Leadership  in 
tackling  the  pilot  retention  problem.  During  the  course  of 
this  research,  it  became  evident  that  merely  providing  the 
models  would  be  of  little  value  if  they  were  not  presented 
in  some  sort  of  management  context.  I  therefore  presented 
the  modeling  effort  as  a  portion  of  the  turnover  management 
process.

It  also  became  evident  that  the  long-term  retention 
problem  is  likely  to  get  worse.  This  is  due  to  several 
reasons:  airline  expansion,  pilot  retirements,  population 
demographics,  and  some  of  the  attempts  to  control  turnover 
themselves.  I  believe  the  latter  two  are  time  bombs  that 
must  be  dealt  with  now,  before  their  impact  is  felt. 

Any  thesis  is  a  synergistic  effort  and  I  will 
therefore  not  attempt  to  single  out  every  individual  who 
assisted  me  with  this  research.  I  hope  that  a  simple 
"thank  you"  to  the  Institute's  faculty  and  staff  will 
suffice.  I  would  be  remiss,  however,  if  I  did  not  thank  my 
wife,  Jeanette,  and  our  children,  Brian,  John,  Mary,  Peter, 
and  Thomas  for  their  patience  and  understanding.  I  know 
the  time  lost  cannot  be  made  up,  but  hopefully  we  will  all 
be  better  for  the  experience.  Thanks  also,  to  Mom  and  Dad. 

I  hope  these  models  may  be  of  some  use.  If  you  have 
any  questions,  you  can  find  me  on  the  golf  course. 

Bruce  A.  Guzowski 
Abstract


Personnel  planners  in  various  Air  Force  agencies  use 
models,  among  other  things,  to  aid  them  in  forecasting
pilot  retention  rates.  This  research  effort  attempted  to
forecast  retention  rates  three  years  ahead  with  the  use  of 
multiple  regression  analysis  techniques.  Such  models  can 
be  of  use  to  Air  Force  leaders  to  develop  proactive 
policies  and  programs  to  combat  poor  retention  forecasts. 

Economically  quantifiable  variables  were  primarily 
used  in  the  modeling  effort.  However,  some  year  groups 
could  not  be  adequately  explained  with  the  use  of  economic 
variables  alone.  The  models  for  year  groups  eight, 
twelve,  and  thirteen  used  the  retention  rates  of  "peer 
groups"  to  assist  in  explaining  their  own  retention  rates. 

All  models  were  subjected  to  common  internal  tests 
associated  with  linear  regression.  External  validity  was 
verified  by  the  use  of  a  withheld  data  set.  Forecasts 
were  made  for  Fiscal  Years  90,  91,  and  92,  using 
independent  variable  data  from  1987,  1988,  and  1989, 
respectively.  All  tests  and  forecasts  were  thoroughly 
documented. 

The  practical  and  policy  implications  of  these 
forecasts  were  discussed,  and  some  thoughts  about  possible 
policies  and  programs  to  increase  retention  were  advanced. 
Improvements  to  further  the  utility  of  these  models  were 
suggested.


A  METHODOLOGY  FOR  LONG-TERM  FORECASTS 
OF  AIR  FORCE  PILOT  RETENTION  RATES: 

A  MANAGEMENT  PERSPECTIVE 


I.  Introduction 


General  Issue 

Employee  turnover  in  any  organization  can  be  very 
costly  if  not  controlled.  In  the  Air  Force,  the  loss  of  a 
single  pilot  to  the  civilian  sector  represents  a  cost  of 
millions  of  dollars  in  training  and  experience  (11:11). 
Additionally,  if  large  numbers  of  combat-capable  pilots 
depart  the  service  before  they  are  eligible  for  retirement, 
a  potential  exists  for  a  pilot  shortage  within  the  Air 
Force,  where  there  are  not  enough  pilots  to  perform  the  jobs 
which  require  a  pilot's  presence  (flying  or  staff  duty). 

Such  a  shortage  could  quickly  translate  into  lost
war-fighting  capability.

A  1984  Air  War  College  Research  Report  discussed  the 
consequences  of  pilot  shortages  under  manning  policies  in 
place  at  that  time.  The  report  noted  that  previous  pilot 
shortages  were  solved  by  drawing  on  the  surplus  of  pilots 
within  the  rated  supplement  program  (a  program  that  assigns 
rated  officers  [pilots  and  navigators]  to  non-flying 
positions).  Thus,  a  surplus  of  rated  officer  experience  was 
maintained  during  periods  when  pilot  retention  was  high,  and
this  "reserve"  was  drawn  upon  to  fill  vacancies  created  when
retention  rates  fell  below  levels  required  to  maintain 
combat  readiness.  However,  current  pilot  losses  are  greater 
than  what  the  rated  supplement  program  can  support.  While 
Air  Force  policy  is  geared  toward  retaining  those  pilots 
already  on  active  duty,  vacancies  created  by  turnover  within
this  group  are,  for  the  first  time  in  Air  Force  history, 
being  filled  by  younger,  less  experienced  officers  (13:20). 
Indeed,  as  turnover  becomes  ever  larger,  the  effect  of 
filling  vacancies  from  below  will  inevitably  drive  the  high 
standard  of  readiness  the  Air  Force  has  traditionally 
maintained  to  some  lower  level  (13:19). 

It  should  be  noted  that  not  all  turnover  bears
bitter  fruit.  Indeed,  the  Air  Force  recognizes  this,  and
has  historically  planned  for  a  cumulative  retention  rate  of 
pilots  within  the  six  to  eleven  year  group  of  sixty  percent. 
In  other  words,  for  every  ten  pilots  entering  their  seventh 
year  of  active  duty,  the  Air  Force  plans  on  having  six  of 
those  pilots  on  active  duty  by  the  end  of  their  eleventh 
year  of  service.  This  translates  to  roughly  a  ten  percent 
turnover  rate  per  year,  which  is  comparable  to  the  rates 
that  civilian  firms  plan  for  (28).  This  planned  turnover 
rate  is  functional  turnover,  where  the  health  of  the 
organization  is  not  jeopardized  by  these  "programmed" 
losses.  However,  turnover  rates  in  excess  of  those  planned
are  dysfunctional  and  need  to  be  corrected  if  the
organization  is  to  remain  healthy  and  viable. 
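
The step from a sixty percent cumulative retention rate to "roughly a ten percent turnover rate per year" can be checked with a short calculation. The sketch below assumes a constant annual retention rate compounded over the five years of service from the seventh through the eleventh; the function name is illustrative and does not come from any Air Force model.

```python
def annual_rate_from_cumulative(cumulative_retention: float, years: int) -> float:
    """Constant annual retention rate that compounds to the cumulative rate."""
    return cumulative_retention ** (1.0 / years)


# Planning figure from the text: 60 percent retained over years 7 through 11.
annual_retention = annual_rate_from_cumulative(0.60, 5)
annual_turnover = 1.0 - annual_retention

print(f"annual retention: {annual_retention:.3f}")  # about 0.903
print(f"annual turnover:  {annual_turnover:.3f}")   # about 0.097
```

Under that assumption the implied annual turnover is just under ten percent, which agrees with the rounded figure quoted above.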


Turnover  Research 

There  have  been  over  1000  studies  of  employee  turnover 
during  this  century  (18:82).  In  1977,  William  H.  Mobley 
produced  his  Intermediate  Linkages  Model  of  employee 
turnover  (Figure  1),  which  focused  on  turnover  as  a  process. 
In  Mobley's  research,  the  intention  to  quit  was  deemed  to  be 
the  only  reliable  predictor  of  the  turnover  event  (18:122). 
His  research  has  become  the  foundation  of  modern  studies  on 
employee  turnover  (28).  Mobley  asserts  that  employee
turnover  is  manageable  in  a  dynamic  environment: 

The  manager  must  be  able  to:  diagnose  the 
nature  and  probable  determinants  of  turnover  in 
his  organization;  assess  the  probable
individual  and  organizational  consequences  of  the 
various  types  of  turnover;  design  and  implement 
policies,  practices,  and  programs  for  effectively 
dealing  with  turnover;  evaluate  the 
effectiveness  of  changes;  and  anticipate  further 
changes  to  effectively  manage  turnover.  .  .  .  (18:78)

Mobley  offers  a  graphic  portrayal  of  the  management 
view  of  this  turnover  process  (Figure  2).


A  Systems  Approach 

General  systems  theory  says  that  an  organization  may  be 
viewed  as  a  system  that  interacts  with  its  environment  in  an 
analogous  manner  to  biological  systems: 
Notes:

(1)  Alternative  forms  of  withdrawal,  e.g.,  absenteeism,  passive
job  behavior

(2)  Non-job  related  factors,  e.g.,  transfer  of  spouse,  may
stimulate  intention  to  search

(3)  Unsolicited  or  highly  visible  alternatives  may  stimulate
evaluation

(4)  Impulsive  behavior

Figure  1.  Mobley’s  Intermediate  Linkages  Model  (18:123)

Figure  2.  Management  Perspective  of  the
Turnover  Process  (18:12)
.  .  .  Richard  Johnson,  Fremont  Kast,  and  James
Rosenzweig  related  the  corporate  enterprise 
structure  ...  to  an  open-ended  cell: 

An  organism  is  an  open  system  which  maintains  a 
constant  state  while  matter  and  energy  which  enter 
it  keep  changing  (so-called  dynamic  equilibrium).
The  organization  is  influenced  by,  and 
influences,  its  environment.  Such  a  description 
of  a  system  adequately  fits  the  typical  business 
organization.  The  business  organization  is  a 
man-made  system  which  has  dynamic  interplay  with 
its  environment  —  customers,  competitors,  labor 
organizations,  suppliers,  government  and  many 
other  agencies.  Furthermore,  the  business 
organization  is  a  system  of  interrelated  parts 
[subsystems]  working  in  conjunction  with  each 
other  in  order  to  accomplish  a  number  of  goals, 
both  those  of  the  organization  and  those  of  the 
individual  participants.  (15:66) 


Any  system  or  subsystem  takes  inputs,  processes  them 
(throughput),  and  produces  outputs  (5).  Figure  3.a.  is  a
simple  depiction  of  this  systems  model.  The  vertical  arrows 
in  and  out  of  the  throughput  box  represent  interaction  with 
the  environment  in  which  the  system  exists. 

Figure  3.b.  is  an  attempt  to  model  the  Air  Force 
personnel  system  with  respect  to  systems  theory.  Here, 
inputs  may  be  viewed  as  recruits.  The  throughput  box  may  in 
turn  be  seen  as  Mobley’s  turnover  process  (Figure  1).  Note 
here  that  interaction  with  the  environment  is  depicted  as 
being  one  way  (out).  This  is  intended  to  show  that  once
pilot  turnover  occurs  (vertical  arrows),  replacement
currently  comes  only  from  within,  through  more  inputs. 
Outputs  may  be  viewed  in  this  second  model  as  functional 
turnover  and  retirement. 
[Diagram:  Inputs  -->  Throughput  -->  Outputs]


Figure  3.a.  Simple  Systems  Model


[Diagram:  Inputs  -->  Throughput  -->  Outputs]


Figure  3.b.  Modified  Systems  Model
Systems  that  freely  interact  with  the  environment  are
known  as  open  systems,  while  those  that  do  not  are  called 
closed  systems.  Since  any  organizational  system  is 
inherently  open,  it  thus  becomes  useful  to  view  its  degree 
of  openness.  Organizational  systems  may  therefore  be  seen 
as  relatively  open  or  relatively  closed  (17:65).  The  model 
in  Figure  3.b.  may  be  seen  as  a  relatively  closed  system. 

Tom  Peters,  in  his  book,  Thriving  on  Chaos,  offers  this  view 
of  the  interaction  open  systems  must  have  with  their 
environments:

The  winners  of  tomorrow  will  deal  proactively 
with  chaos,  will  look  at  the  chaos  per  se  as  the 
source  of  .  .  .  advantage,  not  as  a  problem  to 
be  got  around.  Chaos  and  uncertainty  are  .  .  . 
opportunities  for  the  wise  (23:xiv). 

When  a  system  chooses  to  limit  interaction  with  its
environment,  it  "buys"  short  term  stability  at  the  expense
of  long  term  stability  (5;  17:66).  The  limited
interaction  with  the  environment  depicted  in  the  second
model  may  then  be  seen  as  a  source  of  long  term  instability.
An  organization's  limited  interaction  with  its  environment
may  manifest  itself  in  the  form  of  controls  or  regulations,
often  not  producing  the  desired  results.  Peter  Drucker,  in
The  New  Realities,  states:

.  .  .  The  Chicago  economist  George  J.  Stigler 
(winner  of  the  1982  Nobel  prize  in  Economics)  has 
shown  in  years  of  painstaking  research  that  not 
one  of  the  regulations  through  which  the  U.S. 
Government  has  tried  over  the  years  to  control, 
direct,  or  regulate  the  economy  has  worked.  They 
were  either  ineffectual  or  produced  the  opposite 
of  the  intended  results.  (6:166) 
USAF  Pilot  Turnover


Mobley  has  identified  four  general  classes  of  turnover 
determinants:

external  economy:  unemployment,  inflation,  etc.
organizational  variables:  e.g.,  reward  system,
job  design,  leadership
individual  non-work  variables:  e.g.,  spouse's
career,  family  responsibility
individual  work  related  variables:  e.g.,
values,  expectations,  commitment  (18:78)

Concentrating  on  any  single  one  of  these  determinants  will
not  give  the  Air  Force  leader  a  complete  picture  of  the
turnover  process  within  his  organization.  However,  a
significant  positive  correlation  has  been  found  to  exist
between  pilot  turnover  and  domestic  airline  pilot  hiring
activity.  This  hiring  activity  has  in  turn  been  shown  to  be
positively  correlated  to  general  economic  strength  (8:15).
These  positive  correlations  have  been  the  basis  of  some
models  used  by  Air  Force  planners  to  forecast  pilot  turnover
(25:14).

The  View  of  Management.

Air  Force  leadership  has  been  following  Mobley's 
paradigm  for  turnover  management.  Attempts  are  made  to 
anticipate  turnover  and  its  determinants  are  assessed 
through  various  studies  and  surveys.  The  costs  and 
consequences  of  pilot  turnover  are  subsequently  calculated, 
and  policies  and  programs  are  then  implemented  which  address 
the  negative  effects  of  pilot  turnover. 
As  pilot  turnover  in  recent  years  has  become
dysfunctional  within  the  Air  Force,  several  remedial 
measures  have  been  implemented  by  Air  Force  leaders  to 
address  suspected  turnover  variables.  These  efforts  have 
been  primarily  directed  at  the  organizational  variables 
determinant,  since  Mobley's  other  classes  of  determinants 
are  less  easily  influenced  by  leaders.  Turnover  may  be  seen 
as  a  measure  of  employee  morale  (28).  When  morale  (and
hence  satisfaction)  is  high,  turnover  is  low.  The  converse 
is  also  true.  One  method  by  which  Air  Force  leadership  has 
attempted  to  address  the  perceived  morale  problem  is  through 
giving  the  pilot  a  distinct  identity.  The  Air  Force  of  the 
1950s  had  nearly  60,000  pilots.  Today,  there  are  roughly 
one-third  that  number  on  active  duty  (13:9).  Indeed,  in
today's  Air  Force,  officers  that  are  pilots  are  in  the 
minority.  The  issuance  of  leather  flight  jackets  to 
aircrews  was  reinstituted  in  part  to  combat  the  problem  of 
pilots  failing  to  identify  themselves  with  the  Air  Force, 
and  to  therefore  increase  their  esprit  de  corps,  or  morale. 

Another  organizational  variable  addressed  by  Air  Force 
leadership  is  the  reward  system  (compensation)  for  pilots. 
Beginning  in  January  of  1989,  a  pilot  who  had  completed  the 
initial  active  duty  service  commitment  (ADSC)  incurred  from 
Undergraduate  Pilot  Training  (UPT),  with  less  than  14  years
of  total  service,  was  eligible  to  receive  a  bonus  payment  of
up  to  $12,000  per  year.  In  return  for  these  payments,
the  pilot  obligated  himself  to  serve  through  the  14-year
point  of  total  service. 

A  compensation  measure  more  recently  enacted  by 
Congress  increased  the  monthly  payments  pilots  (and 
navigators)  receive  under  the  Aviation  Career  Incentive  Pay 
(ACIP)  act.  Otherwise  known  as  "flight  pay,"  these  payments
have  been  raised  to  a  maximum  of  $650  per  month. 

According  to  Mobley,  once  such  policies  and  programs 
are  implemented,  their  effectiveness  must  be  evaluated. 

While  it  is  not  the  author's  intent  to  judge  the 
effectiveness  of  the  aforementioned  programs,  voluntary 
retention  rates  have  not  been  significantly  positively 
affected  by  these  measures,  though  the  decline  in  retention 
rates  appears  to  be  leveling  off  (9:9). 

The  key  to  Mobley's  management  perspective  of  the 
turnover  process  lies  in  management's  ability  to  anticipate 
the  turnover  event.  Theoretically,  if  an  event  is 
anticipated  far  enough  in  advance  of  the  time  that  it 
occurs,  measures  may  be  taken  by  management  to  either 
rectify  the  determinant  causes  of  the  event,  and  hence  alter 
the  outcome,  or  act  to  mitigate  the  consequences  of  the 
event  if  it  indeed  occurs.  Thus,  the  previously  discussed 
initiatives  taken  by  Air  Force  leaders  may  be  seen  as  being 
directed  at  eliminating  the  causes  of  turnover. 

However,  not  all  causes  of  turnover  are  under  the
control  of  the  Air  Force  leader.  The  strong  positive
correlation  between  airline  hiring  and  pilot  turnover  is  but
one  example:  the  Air  Force  leader  has  no  ability  to  control
airline  hiring.  As  airlines  continue  to  raid  the  Air
Force's  "bank"  of  trained  pilot  resources  (and  it  appears
they  wish  to  continue  to  do  so  for  the  coming  decade
[30:S10;  22:101]),  another  policy  has  been  implemented  by
Air  Force  leadership  in  an  attempt  to  deny  these  resources
to  the  competition:  lengthening  the  active  duty  service
commitments  (ADSCs)  incurred  by  those  graduating  from  UPT.

Figure  4  plots  ADSCs  incurred  by  pilots.  The  ADSC  for 
UPT  has  gone  from  six  years  for  a  pilot  graduating  in  1980 
to  10  years  for  one  who  graduates  today  (10:7).  Clearly, 
increasing  ADSCs  incurred  from  UPT  is  a  very  effective 
method  for  maintaining  the  desired  number  of  pilots  within 
the  Air  Force.  Such  policies,  though,  may  not  be  fully 
evaluated  until  these  pilots  are  eligible  to  leave  the 
service.  Thus,  the  impact  of  a  10-year  ADSC  for  pilots 
graduating  from  UPT  in  1990  will  not  be  known  until  the  year 
2000. 

Anticipating  USAF  Pilot  Turnover. 

In  1987,  an  Air  Force  Institute  of  Technology  (AFIT) 
graduate  student,  Captain  James  R.  Simpson,  conducted
research  on  the  revision  of  the  Econometric  Adjustment 
Model,  which  is  one  model  used  by  Air  Force  planners  to 
forecast  pilot  retention  rates.  The  result  of  his  research 
was  a  Pay  Model,  which  used  regression  analysis  techniques
to  forecast  pilot  retention  rates  one  year  ahead.

Econometrics.

"Econometrics  is  the  art  and  science  of  using 
statistical  methods  for  the  measurement  of  economic 
relations"  (3:1).  An  econometric  model  is  one  that 
generally  uses  regression  analysis  methods  to  forecast 
future  values  of  a  desired  event  (24:1).  The  event  of 
interest  in  this  research  is  USAF  voluntary  pilot  retention 
rates.  Voluntary  retention  rates  are  calculated,  by  year
group,  by  subtracting  the  number  of  non-retirement  eligible
pilots  who  leave  the  Air  Force  each  fiscal  year  (FY)  from  the
number  of  pilots  in  the  same  year  group  who  are  eligible  to
leave  the  service  for  at  least  one  day  during  the  same  FY.
This  number  is  in  turn  divided  by  the  eligible  population  to
provide  the  retention  rate.
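
The calculation just described can be written out as a short function. This is a minimal sketch only: the function and argument names are illustrative, and the counts in the example are hypothetical values chosen to make the arithmetic easy to follow, not figures from the data sets used in this research.

```python
def voluntary_retention_rate(eligible: int, voluntary_losses: int) -> float:
    """Retention rate for one year group in one fiscal year.

    eligible:          pilots in the year group who were eligible to leave
                       the service for at least one day during the FY
    voluntary_losses:  non-retirement eligible pilots in that year group
                       who left the Air Force during the FY
    """
    if eligible <= 0:
        raise ValueError("eligible population must be positive")
    if not 0 <= voluntary_losses <= eligible:
        raise ValueError("losses must lie between 0 and the eligible population")
    return (eligible - voluntary_losses) / eligible


# Hypothetical year group: 200 eligible pilots, 50 of whom separate.
rate = voluntary_retention_rate(eligible=200, voluntary_losses=50)
print(rate)  # 0.75
```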

An  econometric  model  employs  what  is  known  as  the 
causal  method  of  forecasting,  where  some  economic  events  are 
shown  to  cause  other  economic  events.  For  example,  a  simple 
econometric  model  would  be  one  that  links  unemployment 
levels  to  claims  for  unemployment  insurance  payments  (a  rise 
in  the  unemployment  rate  would  cause  a  rise  in  the  number  of 
claims  for  unemployment  compensation).  Econometric  models
that  use  regression  analysis  are  capable  of  producing  valid 
long-term  (2  to  5  years)  forecasts  (31:38). 
In  his  recommendations  for  further  research,  Captain
Simpson  stated:

.  .  .  enhancements  to  this  model  that  would
increase  its  utility  include  the  following:  .  .  .
predict  retention  rates  in  the  out  years  (2  or
more  years  ahead)  .  .  .  .  (25:55)

Specific  Problem 

The  Pay  Model  developed  by  Captain  Simpson  adequately 
predicted  turnover  rates  of  Air  Force  pilots  in  year  groups 
7,  8,  9,  10,  and  11.  However,  the  short  lead  times  it 
provided  did  not  give  Air  Force  planners  enough  warning  of 
significant  movements  in  pilot  retention  rates.  To  assist 
Air  Force  planners  in  the  vital  area  of  rated  personnel 
management,  a  model  that  can  provide  accurate  retention 
predictions  with  at  least  two  years  lead  time  is  needed. 

Research  Objective 

Captain  Simpson's  research  showed  that  regression 
analysis  was  a  valid  technique  to  predict  Air  Force  pilot 
turnover.  Yet,  the  computer  program  he  developed  provided 
only  twelve-month  lead  times.  The  objective  of  this 
research  is  to  develop  a  regression  model  that  will 
adequately  predict  pilot  retention  three  years  in  advance. 

Scope 

For  the  purpose  of  this  research,  the  population  of 
concern  will  be  those  Air  Force  pilots  who  have  completed  a 
minimum  of  seven  but  not  more  than  14  years  of  service  (YOS)
in  the  USAF.  These  year  groups  comprise  the  target  year 
groups  of  current  Air  Force  pilot  retention  efforts,  as 
evidenced  by  the  policies  and  programs  mentioned  in  this 
chapter.
II.  The  Retention  Model


Introduction 

An  econometric  model  that  attempts  to  predict  future 
values  of  some  dependent  variable  is  known  as  a  forecast. 
Econometric  forecasting  uses  independent  variables  (IVs),
generally  in  a  regression  equation,  to  predict  future
values  of  a  dependent  variable  (DV).  These  IVs  are  said
to  have  a  causal  relationship  with  the  DV.  In  other  words,
each  IV  should  have  some  power  to  predict  the  DV  (12:195).

As  previously  stated,  general  economic  strength  is  an 
accurate  predictor  of  airline  hiring,  and  in  turn,  pilot 
turnover.  It  must  therefore  be  determined  which  variables 
may  be  able  to  predict  economic  strength  or  airline  hiring 
three  years  hence.  Such  variables  are  known  as  leading 
indicators,  and  their  use  in  regression  analysis 
forecasting  is  highly  desirable  (1:122).  A  leading 
indicator  displays  what  is  known  as  pure  delay.  Pure
delay  exists  when  the  movement  of  the  DV  responds  to 
movements  in  the  IV  some  period  earlier.  Once  these 
variables  are  discerned,  the  only  other  data  required  to 
construct  the  model  would  be  the  turnover  rates 
themselves,  as  these  are  the  events  that  are  to  be 
predicted.
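
The idea of a leading indicator with pure delay can be illustrated with a small single-variable regression. The sketch below is not the model developed in this research: it uses one invented IV, synthetic values constructed so that the DV responds exactly to the IV three years earlier, and hand-written ordinary least squares. All names and numbers are assumptions made for illustration.

```python
LAG_YEARS = 3


def lag_pairs(iv_by_year, dv_by_year, lag):
    """Pair each DV value with the IV value observed `lag` years earlier."""
    return [
        (iv_by_year[year - lag], dv)
        for year, dv in sorted(dv_by_year.items())
        if year - lag in iv_by_year
    ]


def fit_line(pairs):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic leading indicator (IV), indexed by year. Not real data.
iv_by_year = {1980: 100, 1981: 110, 1982: 105, 1983: 120,
              1984: 130, 1985: 125, 1986: 140}

# Synthetic retention rates (DV), built to respond exactly to the IV
# three years earlier so that the fit is easy to check by hand.
dv_by_year = {year + LAG_YEARS: 0.9 - 0.002 * iv_by_year[year]
              for year in range(1980, 1986)}

intercept, slope = fit_line(lag_pairs(iv_by_year, dv_by_year, LAG_YEARS))

# The 1986 IV value is already known, so the 1989 DV can be forecast now.
forecast_1989 = intercept + slope * iv_by_year[1986]
print(round(intercept, 3), round(slope, 4), round(forecast_1989, 3))
```

Because the IV is observed three years before the DV it explains, the most recent IV value always yields a forecast three years ahead without requiring any forecast of the IV itself.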
Calendar  Year  Data


In  searching  for  variables  that  would  successfully 
predict  pilot  turnover  three  years  in  advance,  the  author 
learned  that  most  models  used  by  Air  Force  planners  deal 
with  data  in  fiscal  years  (FYs).  A  fiscal  year  currently
runs  from  October  1  in  one  year  to  September  30  of  the 
next  year.  Therefore,  to  properly  lag  IVs  to  some  point 
in  time  prior  to  the  movement  of  the  DV,  similar  time 
units  should  be  used.  However,  if  one  were  more 
interested  in  demonstrating  actual  causality  between  a  set 
of  IVs  and  the  DV,  similar  time  units  may  not  be  so 
critical.  Indeed,  if  one  were  to  closely  examine  the  data 
used  in  some  long-term  forecasts,  one  would  see  that  even 
though  a  time  unit  may  be  the  same  for  the  IV  and  DV,  as 
the  lagging  increases,  the  relative  significance  of 
maintaining  the  same  time  unit  decreases. 

This  research  will  attempt  to  predict  pilot  turnover 
three  years  ahead  of  the  fiscal  year  in  which  it  occurs. 
Obviously,  pilots  will  be  leaving  the  Air  Force  during  the 
entire  FY.  Thus,  any  attempts  to  truly  model  turnover 
with  consistent  time  units  should  attempt  to  do  so  with 
monthly  or  even  weekly  data.  While  some  economic  data  do 
exist  in  monthly  or  weekly  formats,  they  may  not  be  valid 
predictors  of  the  DV. 

The  author  believes  that  identifying  valid  causal 
variables  is  more  important  than  a  perfect  time  unit 
match.  Since  most  IVs  the  author  researched  are  expressed 
in  terms  of  the  calendar  year,  this  will  be  the  time  unit
of  the  IVs,  while  the  time  unit  of  the  DV  will  be  in 
fiscal  years.  Clearly,  any  classical  regression  notation 
with  lagged  variables  would  have  problems  accommodating 
such  a  departure.  While  the  author  recognizes  this,  he 
believes  that  the  actual  predictability  of  the  DV  is  of 
greater  concern.  If  the  reader  has  difficulty  accepting 
this  departure  from  standard  data  management  techniques, 
the  author  suggests  that  it  may  be  easier  to  view  the  IVs 
as  being  lagged  by  33  months,  rather  than  the  3  years  (36 
months)  suggested  in  the  research  objectives  section  of 
the  previous  chapter.  To  avoid  confusion,  however,  the 
author  will  continue  to  refer  to  the  IVs  as  simply  being 
lagged  by  3  years. 
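
The 33-month figure can be verified by counting the months between the start of a calendar year and the start of the fiscal year three numbers later. The sketch below assumes the usual federal convention that a fiscal year is named for the calendar year in which it ends; the helper names are illustrative.

```python
from datetime import date


def fiscal_year_start(fy: int) -> date:
    """Fiscal year N runs from October 1 of year N-1 through September 30 of year N."""
    return date(fy - 1, 10, 1)


def months_between(start: date, end: date) -> int:
    """Whole months from `start` to `end`, assuming both fall on the first of a month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


iv_year = 1987  # calendar year of the independent variable data
dv_fy = 1990    # fiscal year of the retention rate being forecast

lag_months = months_between(date(iv_year, 1, 1), fiscal_year_start(dv_fy))
print(lag_months)  # 33
```

The same 33-month offset holds for any calendar year paired with the fiscal year three numbers later, since both periods are twelve months long.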


Description  of  Data 

Peter  Drucker,  in  The  New  Realities,  discusses  the 

inability  of  economic  theory  to  predict  future  events: 

Every  earlier  economic  theory  postulated  that  one 
such  economy  [microeconomy,  macroeconomy]  totally
controls;  all  others  are  dependent  and 
"functions."  In  the  marginal-utility  world  of 
the  neoclassicists,  the  microeconomy  of 
individuals  and  firms  controls  the  macroeconomy 
of  government.  In  the  Keynesian  and  Post- 
Keynesian  worlds,  the  macroeconomy  of  national 
money  and  credit  controls  the  microeconomy  of 
individuals  and  firms.  But  economic  reality  now 
is  one  of  three  such  economies.  And  soon  the 
economic  region  (as  in  the  European  Economic 
Community),  may  become  a  fourth  semi-dependent
economy.  Each,  to  use  a  mathematician's  term,  is
a  partially  dependent  variable.  None  totally
controls  the  other  three;  none  is  totally
controlled  by  the  others,  either.  Such
complexity  can  barely  be  described.  It  cannot  be
analyzed  since  it  allows  of  no  prediction.
(6:157)

So,  accurately  predicting  the  movement  of  a  large,
complex  economy  is  not  possible.  While  econometric
methods  use  mathematical  models  and  statistical  inference 
to  forecast  future  events,  today's  economy  is  controlled 
by  factors  that  are  not  statistically  significant. 

Consider  the  Butterfly  Effect,  a  rigorous,  albeit 
whimsical,  mathematical  proof  that  shows  how  a  butterfly 
flapping  its  wings  in  the  Amazon  jungle  affects  the 
weather  in  Chicago  weeks  or  months  later.  The  point  is, 
in  a  large,  complex  economy,  the  insignificant  events  are 
likely  to  be  the  ones  with  the  greatest  impact. 
Furthermore,  these  events,  by  definition,  can  be  neither 
anticipated  nor  controlled.  Indeed,  they  may  go
undetected  even  after  they  have  had  their  impact.
(6:165-166) 

Thus,  an  aggregate  model  of  today's  complex  economy 
is  not  possible.  Yet,  if  one  were  to  view  the  economic 
world  as  a  "very  large  and  interdependent  system  of 
simultaneous  stochastic  equations"  (3:309),  then  the  basis 
for  decomposing  the  economy  into  areas  of  predictability 
exists.  In  the  world  of  econometrics,  this  is  done  by 
assuming  the  impact  of  some  variables  is  so  minuscule
that  treating  them  as  zero  will  result  in  very  small
errors  when  estimating  the  impact  of  the  variables
included  in  a  regression  equation  (3:309).  Thus,  while
Drucker  maintains  that  current  economic  theory  cannot 
entirely  explain  the  complexity  of  today,  "theorems  — 
formulae  and  formulations  to  describe  this  or  that 
phenomenon  and  solve  this  or  that  problem  .  .  .  [are  still
possible]"  (6:157). 

Searching  for  economic  variables  that  predict  USAF 
pilot  turnover  may  appear  to  be  a  monumental  task.
However,  if  one  refers  to  the  Mobley  Intermediate  Linkages
Model  of  the  turnover  process,  some  guidelines  may  be 
established  for  variable  inclusion.  The  author  has 
limited  his  search  to  those  economically  related 
quantifiable  variables  that  either  cause  a  pilot  to 
experience  job  dissatisfaction  or  cause  alternative 
employment  to  become  available. 

Data  Variables.

The  following  list  of  economic  indicators  will  be 
investigated  for  inclusion  as  explanatory  variables  in  the 
regression  model  to  be  constructed.  They  are  presented  in 
the  following  format:  data  title,  (short  title),  data 
description  and  justification,  and  data  source.  Appendix 
A  contains  the  data  sets  described  below. 

Pay  Compensation  -  (comp)  -  This  variable  measures 
the  relative  difference  between  military  and  civilian 
earnings.  It  is  stated  as  the  ratio  between  military  and 
civilian  pay,  so  a  figure  of  1.0  would  denote  complete 
equality,  while  those  less  than  1.0  would  show  greater
economic  reward  in  the  civilian  sector,  with  figures
greater  than  1.0  showing  the  opposite.  This  variable 
should  explain  economic  job  satisfaction  or 
dissatisfaction  experienced  by  pilots.  If  the  ratio 
increases,  turnover  should  decrease  (25:16-17).  (Note:  the 
value  for  1974  was  not  available.  The  author  estimated 
1974  using  simple  linear  regression.)  Source:
Headquarters  Air  Force  Military  Personnel  Center
(HQAFMPC)/DPMYAP.

Percent  of  Population  in  Age  Group  25  to  64  -  (perc)
-  This  age  group  would  most  likely  contain  the  major
portion  of  the  business  travelling  public.  Since  most  air 
travel  is  performed  by  businessmen  (21:154),  increases  or 
decreases  in  its  size  may  foretell  similar  airline 
activity.  Source:  U.S.  Bureau  of  the  Census,  Current 
Population  Reports,  Series  P-25. 

Net  Population  Increase,  per  1,000  Population  -
(netgrow)  -  Although  this  would  be  a  very  broad  indicator 
of  eventual  increases  in  economic  activity,  it  may  have 
some  value  in  the  long-term  prediction  of  that  activity. 
This  figure  is  derived  by  subtracting  the  death  rate  from
the  birth  rate,  and  adding  the  immigration  rate  (all  per
1,000  population)  for  the  same  year.  Source:  the  same  as
for  perc. 

Civilian  Labor  Force  Participation  Rate  -  (lfpart)  - 
This  figure  represents  the  proportion  of  the 
noninstitutional  civilian  population  in  the  civilian  labor 
force. The civilian labor force is composed of all
civilians classified as employed or unemployed. As labor
force participation increases, one may find a positive
correlation to airline activity and hence, the need for
pilots. Source: U.S. Bureau of Economic Analysis, Survey
of Current Business.

Net  Business  Formation  -  (nbf)  -  With  a  base  year  of 
1967 (where 1967 = 100), this data records the change in
business  incorporations  less  business  failures.  As  the 
value  of  this  indicator  increases  or  decreases,  so  should 
economic  and  hence,  airline  activity.  One  would  expect 
then  to  find  a  positive  correlation  between  increases  in 
this  statistic  and  pilot  turnover.  Source:  U.S.  Bureau  of 
the  Census:  Statistical  Abstract  of  the  United  States; 
also:  Survey  of  Current  Business. 

Civil Aircraft Shipments - (acship) - The number of
large transports (greater than 70 passenger capacity)
shipped per year. A positive correlation may exist
between the addition of new aircraft to airline fleets and
future pilot demand by those airlines. Source:
Statistical Abstract of the United States.

Aerospace  Sales,  Net  New  Orders  -  (sales)  -  Derived 
from  reports  submitted  by  companies  whose  principal 
business  is  the  development  and/or  production  of  aircraft, 
aircraft  engines,  missile  and  spacecraft  engines,  missiles 
and/or  spacecraft.  Figures  represent  new  orders  received 
during  the  year  less  cancellations.  Dollar  figures  are 
reported in then-year dollars (billions), and must be
adjusted  to  a  base  year.  The  overall  Gross  National 
Product  (GNP)  implicit  price  deflator  (1982  =  100)  will  be 
used  to  convert  these  data.  The  result  may  have  a 
predictive  effect  on  airline  hiring,  since  new  aircraft 
ordered  by  airlines  will  eventually  increase  the  demand 
for  pilots  in  the  overall  industry.  Source:  U.S.  Bureau 
of  the  Census,  Current  Industrial  Reports,  Series  MA37D. 
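
The conversion of then-year dollars to base-year dollars described above can be sketched as follows. This is a minimal illustration that assumes the usual deflation rule (divide by the index and multiply by 100). The function name and all figures are hypothetical.

```python
def to_constant_dollars(then_year_value, deflator, base=100.0):
    """Convert a then-year dollar amount to base-year dollars.

    `deflator` is the price index for the year in which the amount was
    reported, on a scale where the base year equals `base` (1982 = 100).
    """
    return then_year_value * base / deflator

# Hypothetical net new orders (billions) and deflator values (1982 = 100).
orders = {1980: 42.0, 1982: 50.0, 1985: 66.0}
deflators = {1980: 84.0, 1982: 100.0, 1985: 110.0}

real_orders = {
    year: to_constant_dollars(orders[year], deflators[year])
    for year in orders
}
```

The base year is left unchanged by the conversion, while earlier years are scaled up and later years are scaled down.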

Machine  Tools,  Orders  and  Shipments  - 
(cut/form/mttot)  -  Since  metal  cutting  and  metal  forming 
are  two  primary  processes  by  which  aircraft  and  their 
subsystems  are  manufactured,  orders  for  machines  that 
perform  these  functions  may  have  a  long-term  predictive 
ability on aircraft manufacturing and airline hiring.
Since data is available on either the cutting (cut) or
forming (form) tools, three IVs will be investigated: cut,
form, and if neither proves significant alone or in
combination, the total (mttot) will then be investigated
for possible inclusion in the model. Data are reported in
then-year dollars (millions), and must be converted to a
base year for comparison purposes. Conversion will be
accomplished in the manner described in the previous
variable's discussion. Source: U.S. Bureau of Economic
Analysis, Business Statistics; also: Survey of Current
Business.

GNP Implicit Price Deflator - (GNP) - An implicit
price deflator is derived as the "ratio of a current
dollar estimate (for GNP or a component) to its
corresponding constant dollar estimate multiplied by 100.
. . . Changes in an implicit price deflator reflect not
only changes in prices but also changes in the composition
of GNP or a component" (2:303). Changes in the deflator
itself may have some broad explanatory ability on the
availability of future employment alternatives for the
workforce in general (31:232). Source: U.S. Bureau of
Economic Analysis, Business Statistics; also: Survey of
Current Business.

Scheduled Commercial Air Carriers, Percent Load
Factor - (lofac) - This figure is derived by dividing the
revenue passenger miles flown by U.S. scheduled air
carriers on domestic routes by the number of available
seat miles flown by the same carriers on the same
routes. It would reveal not only current airline
industry health, but also the level of demand for airline
services. Thus, as lofac increases, airlines may be
inclined to expand their services to accommodate increased
demand. Such expansion may then result in increased pilot
demand. Source: U.S. Federal Aviation Administration, FAA
Aviation Forecasts.
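
As a small illustration of the load factor calculation just described, the sketch below expresses the ratio as a percentage. The function name and the traffic figures are hypothetical and are not FAA data.

```python
def load_factor(revenue_passenger_miles, available_seat_miles):
    """Return the percent load factor for one year of traffic."""
    if available_seat_miles <= 0:
        raise ValueError("available seat miles must be positive")
    return 100.0 * revenue_passenger_miles / available_seat_miles

# Hypothetical traffic figures, both in billions of miles.
lofac = load_factor(revenue_passenger_miles=300.0, available_seat_miles=500.0)
```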

Major  Airline  Pilot  Retirements  -  (the  one  that  got 
away)  -  The  Future  Aviation  Professionals  of  America 
(FAPA)  recently  compiled  a  data  bank  to  track  the  number 
of  pilot  retirements  (due  to  age)  the  major  airlines  will 
experience.  Theoretically,  for  an  airline  to  maintain  its 
current level of service, it would need to replace these
pilots on a one-for-one basis. Unfortunately, this is a
forward-looking database, and the historical data goes
back only to 1988. Thus, it would not be suitable for any
current modeling purpose. However, FAPA was gracious
enough to provide the author with a list of this data, and
it is included as Appendix B, for future use by other
modelers.

The  Dependent  Variable. 

The variable to be predicted with the use of
the above-mentioned variables is the voluntary retention
rate of Air Force pilots. A retention rate is the
percent  of  individuals  who  remain  in  the  service  out  of 
those  who  have  the  opportunity  to  leave.  This  data  is 
maintained  by  HQAFMPC  Analysis  Division.  The  year  groups 
of  interest  are  those  which  contain  pilots  who  have 
completed  their  initial  pilot  training  obligation,  but 
have  yet  to  "commit"  themselves  to  an  Air  Force  career. 

As  discussed  in  the  previous  chapter,  current  retention 
efforts  are  aimed  at  those  pilots  in  the  7  to  14  year 
groups.  Therefore,  these  are  the  year  groups  whose 
retention  by  the  service  this  research  intends  to  predict. 

Overview  of  the  Analytical  Model 

"Regression  Analysis  is  a  statistical  tool  that 
utilizes  the  relation  between  two  or  more  quantitative 
variables  so  that  one  [the  dependent  variable]  can  be 
predicted from the other or others [the independent
variables]" (19:23). Typically, regression analysis is
the method of fitting a line to a series of plotted data
points on a Cartesian plane in such a fashion that the
line is the best estimator for the values plotted. Since
this line can be stated with a mathematical formula, it
may then be used to predict future values of the dependent
variable. The formula used to depict this line is known
as the General Linear Model, and may be expressed as
follows:

$$Y_j = \beta_0 + \beta_1 X_{j1} + \cdots + \beta_k X_{jk} + \varepsilon_j$$

where:

$Y_j$ is the value of the dependent variable on the jth
trial

$\beta_0, \beta_1, \ldots, \beta_k$ are parameters to be estimated

$X_{j1}, \ldots, X_{jk}$ are known constants, the values of the
independent variables in the jth trial

$\varepsilon_j$ are error terms (the difference between the
observed and predicted value of $Y_j$) (19:31)

Estimators of the regression parameters are found
using the method of least squares, where, for each
observation ($X_j$, $Y_j$), an expected value is computed.
This expected value is then subtracted from the observed
value of $Y_j$ and squared. The sum of these squared
differences is minimized when fitting the regression line,
with use of the best estimators (19:38). These parameter
estimates are known as regression coefficients.
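
The least squares criterion described above may be written out as a worked equation. The symbols $Q$ (the quantity minimized), $n$ (the number of observations), $\mathbf{X}$ (the matrix of IV values with a leading column of ones), $\mathbf{Y}$ (the vector of DV values), and $\mathbf{b}$ (the vector of coefficient estimates) are notation assumed here for illustration and do not appear in the text:

$$Q = \sum_{j=1}^{n} \left( Y_j - \beta_0 - \beta_1 X_{j1} - \cdots - \beta_k X_{jk} \right)^2$$

The regression coefficients are the values of $\beta_0, \ldots, \beta_k$ that minimize $Q$. In matrix form, provided $\mathbf{X}'\mathbf{X}$ can be inverted, the standard solution is

$$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$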
Assumptions of the Model.

The following are basic assumptions of the GLM:

•  The relationship between the IVs and DV is
linear. That is to say, the magnitude of a coefficient
does not change with movement (change) of its IV.

•  Error ($\varepsilon_j$) is a normally distributed random
variable with a mean of zero, and a constant variance
between observations.

•  Independence of $\varepsilon_j$ implies the errors are
uncorrelated.

•  The IVs are statistically independent of each
other (7:62, 19:52).

Use  of  the  Model. 

Once  the  best  estimators  of  the  regression  parameters 
have  been  determined,  the  regression  model  may  be  written 
and  plotted.  Provided  the  predictor  variables  exhibit 
enough  pure  delay,  future  values  of  the  dependent  variable 
(pilot retention rates) may then be accurately predicted.
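
One reading of "pure delay" is that each predictor is observed some years before the retention rate it helps explain. The sketch below, under that assumption, pairs each DV observation with the IV value from three years earlier. The function name `lag_pairs` and all values are hypothetical.

```python
def lag_pairs(iv_by_year, dv_by_year, lag=3):
    """Pair each DV observation with the IV value observed `lag` years earlier.

    Returns a list of (dv_year, lagged_iv_value, dv_value) tuples for the
    years in which both pieces of data are available.
    """
    pairs = []
    for year in sorted(dv_by_year):
        iv_year = year - lag
        if iv_year in iv_by_year:
            pairs.append((year, iv_by_year[iv_year], dv_by_year[year]))
    return pairs

# Hypothetical indicator values and retention rates (illustration only).
iv = {1984: 10.0, 1985: 11.0, 1986: 12.5, 1987: 13.0}
dv = {1987: 0.55, 1988: 0.50, 1989: 0.45, 1990: 0.40}

pairs = lag_pairs(iv, dv)
```

A model fitted to such pairs can use the most recent indicator values to forecast retention three years ahead, since those indicator values are already known.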
III. Methodology


Aggregate Versus Disaggregate Models

Regression  models  have  shown  strong  relationships 
between  explanatory  variables  and  employee  turnover  on  an 
aggregate  level  (29:847).  However,  once  the  same  data 
used  in  an  aggregate  model  are  disaggregated,  the 
relationships between the same IVs and the disaggregated
DVs become inconclusive (29:848). Thus, regression
models may be inappropriate for disaggregated data.

Data must be disaggregated to some extent in order to
employ regression techniques (3:309). However, this
decomposition process may carry the data to a level of
detail at which the IVs retain no ability to predict
turnover. Since Captain
Simpson's research produced a valid regression model with
turnover data by years of service (YOS), the author
developed  models  that  use  data  at  the  same  level  of  detail 
to  obtain  predicted  turnover  rates.  The  difference 
between  the  two  models  is  the  amount  of  lead  time 
provided,  thus  necessitating  an  ostensibly  different  set 
of  IVs. 

Developing  the  Basic  Regression  Model 

Construction of a regression model is a highly
iterative process that results in estimates of the $\beta$
values that give the best fit. Fortunately, computer
packages exist that run these iterations in a fast
and virtually error-free fashion. For the purpose of this
research, the programs SAS and Statistix were used to
compute values of the dependent variable and to perform
tests on the validity of the model. In addition, other
software (MathCAD, Quattro) was utilized to provide
random numbers and graphic capability, respectively.

Variable  Inclusion. 

In chapter two, eleven variables were identified for
possible inclusion in the model. The next logical step is
to decide which variables should indeed be included.
As was noted earlier, the variables which are
included should have an explanatory effect on the DV.
This is known as causality.

. . . causality . . . [is defined] . . . such that X causes
Y if and only if the variance of the error [or
the mean square error, Pierce and Haugh (1977)]
in forecasting Y is lower if the information on X
along with all other relevant information is used
in forecasting Y, compared to the variance of the
forecasting error when knowledge of X is not used
in forecasting Y. (12:196)

Thus,  a  variable  should  be  excluded  if  it  does  not  reduce 
the  amount  of  variability,  and  hence,  error  the  regression 
parameters  exhibit  in  their  explanation  of  the  DV. 

Stepwise is a SAS procedure that automatically
considers an IV's causality. Using the MAXR option with
this procedure, the best 1...n variable model is built,
with the criterion for variable inclusion being whether
the variable increases the model's measure of $R^2$
(25:765). The MAXR option will build at most eleven
models from a data set that contains eleven IVs. Fewer
models will be built by the procedure if no improvement in
$R^2$ can be obtained with the addition of another IV.

The  Coefficient  of  Multiple  Determination. 

In regression analysis, $R^2$ is the symbol for the
coefficient of multiple determination. "It measures the
proportionate reduction of total variation in Y associated
with the use of the set of X variables ..." (19:241).
$R^2$ may range in value from zero to one, with one being a
perfect fit of the IVs to the DV (where no error exists).
The closer $R^2$ is to one, the greater the accuracy of the
model becomes. As more variables are added to the model,
$R^2$ will inevitably increase, regardless of the amount of
variation the new variable explains (19:241). MAXR
compensates for this by adding variables to the model in
the order of which ones produce the greatest increase in
$R^2$.

Adjusted $R^2$.

Determining which model to initially test, however,
should not be based on the size of $R^2$, but adjusted $R^2$
($R_a^2$). $R^2$ is adjusted by dividing each of its components
by its respective degrees of freedom. The formula for $R_a^2$
is:

$$R_a^2 = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSTO}$$

where:

$n$ is the number of observations associated with the
IVs

$p$ is the number of parameters that must be
estimated

$SSE$ is the sum of the squared errors between the
predicted and actual values of the DVs

$SSTO$ is the sum of squared deviations about the
sample mean of the DVs (19:236-241)

Degrees of freedom ($n - 1$ and $n - p$) may thus be seen as a
tool to mitigate the effect on $R_a^2$ that adding more
variables to the model has.
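
The effect of the degrees-of-freedom adjustment can be seen in a short sketch. The function names and the sums of squares below are hypothetical and were chosen only to show a case in which an added variable raises the unadjusted coefficient but lowers the adjusted one.

```python
def r_squared(sse, ssto):
    """Coefficient of multiple determination: 1 - SSE / SSTO."""
    return 1.0 - sse / ssto

def adjusted_r_squared(sse, ssto, n, p):
    """Adjusted coefficient: 1 - ((n - 1) / (n - p)) * (SSE / SSTO).

    n is the number of observations; p is the number of parameters
    estimated, counting the intercept.
    """
    if n <= p:
        raise ValueError("need more observations than parameters")
    return 1.0 - ((n - 1) / (n - p)) * (sse / ssto)

# Hypothetical fits to the same 12 observations (SSTO = 100).
# Model A has p = 3 parameters; model B adds one IV (p = 4) and
# reduces SSE only slightly.
r2_a = r_squared(10.0, 100.0)
r2_b = r_squared(9.8, 100.0)
ra2_a = adjusted_r_squared(10.0, 100.0, n=12, p=3)
ra2_b = adjusted_r_squared(9.8, 100.0, n=12, p=4)
```

Here the unadjusted value rises from model A to model B, while the adjusted value falls, which is the behavior the adjustment is meant to expose.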

Selecting  the  Model. 

Once  the  MAXR  results  were  returned  from  a  SAS  run, 
the coefficients of model candidates were examined for
their  significance  in  the  model.  Traditionally,  the 
levels  of  significance,  or  alpha  values,  used  in  model 
building  have  been  .1,  .05,  and  .01.  With  the  advent  of 
computers  and  the  appropriate  software  to  perform 
virtually  all  of  the  calculations  involved  in  this  highly 
iterative  process,  however,  alpha  values  are  viewed  less 
as  a  concrete  decision  tool  to  be  used  to  determine 
whether  to  accept  or  reject  a  particular  IV  for  model 
inclusion,  and  more  as  a  relative  indicator  of  the 
validity  of  the  model.  Thus,  while  smaller  alpha  values 
indicate  a  better  model  fit,  relatively  large  alphas 
need not necessarily be viewed as the sole cause for
rejecting  a  model  parameter. 

Obviously,  several  adequate  models  may  present 
themselves  as  valid  candidates  for  consideration  as  the 
best  model  to  eventually  use  for  forecasting  retention 
rates. However, the highest value of $R_a^2$ is not the
sole criterion for model
selection. As previously discussed, econometric modeling
is partly a science and partly an art. The author views
the art portion as the variable selection process.
Chapter 2 discussed the logic behind selecting variables
for consideration. Once a model with a high $R_a^2$ is built,
however, the variables used in this model must again be
assessed for their ability to capture the synergy of the
turnover process. Thus a two-variable model that has an
$R_a^2$ of .95 might be rejected in favor of a four-variable
model that has a slightly smaller $R_a^2$.

Validating  the  Model  Internally. 

The  GLM  is  validated  internally  by  several  methods. 
The  General  Linear  Test  (GLT)  uses  the  test  statistic,  F, 
to  determine  whether  or  not  the  values  for  the  regression 
coefficients  are  zero: 

The  overall  significance  of  the  regression  can  be 
tested  with  the  ratio  of  explained  to  the 
unexplained  variance.  This  follows  an  F 
distribution  ....  If  the  calculated  F  ratio 
exceeds  the  tabular  value  of  F  at  the  specified 
level  of  significance  and  degrees  of  freedom,  the 
hypothesis  is  accepted  that  the  regression 
parameters are not all equal to zero and that $R^2$
is  significantly  different  from  zero.  (24:145) 
SAS and Statistix compute an F-value as well as the
probability of obtaining a larger F if the regression
coefficients are indeed zero. This probability is called
the significance probability (P), and provides a basis for
accepting or rejecting the model's parameter estimates:
the smaller the probability, the greater the validity of
the model.

Similar tests are performed for each parameter. A
t-value for each parameter estimate is computed by dividing
the estimate by its standard error. The probability of
obtaining a greater absolute value than that of the
computed t-value if the parameter were indeed zero is then
computed. This research will refer to this probability as
the parameter's level of significance, or p-value. Again,
the smaller the p-value is, the greater the validity of
the model in general, and the parameter estimate in
particular.
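
The two test statistics described above may be written in their standard textbook form. The symbols $SSR$ (regression sum of squares, equal to $SSTO - SSE$), $MSR$, $MSE$, $b_k$ (the estimate of $\beta_k$), and $s\{b_k\}$ (its estimated standard error) are notation assumed here for illustration, with $n$ observations and $p$ estimated parameters:

$$F^* = \frac{MSR}{MSE} = \frac{SSR/(p-1)}{SSE/(n-p)}, \qquad t^* = \frac{b_k}{s\{b_k\}}$$

Under the hypothesis that the coefficients in question are zero, $F^*$ is referred to the F distribution with $p-1$ and $n-p$ degrees of freedom, and $t^*$ to the t distribution with $n-p$ degrees of freedom.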

Aptness Analysis.

Other  internal  tests  of  the  model  should  be 
accomplished  to  ascertain  whether  or  not  the  assumptions 
of  the  GLM  are  violated.  Such  testing  is  usually  done 
through  analysis  of  the  residuals  obtained  in  the 
regression, and generally falls under the title, aptness
analysis, or testing for the data's ability to model the
DV. According to Neter, Wasserman, and Kutner, there are
six departures that may be studied to ascertain a model's
aptness:

1)  The  regression  function  is  not  linear. 

2)  The  error  terms  do  not  have  constant  variance. 

3)  The  error  terms  are  not  independent. 

4)  The model fits all but one or a few outlier
observations.

5)  The  error  terms  are  not  normally  distributed. 

6)  One  or  several  important  independent  variables 
have  been  omitted  from  the  model.  (19:116) 

As  previously  stated,  a  regression  function  is  linear 
if  the  magnitude  of  the  regression  coefficients  does  not 
change with the magnitude of the independent
variables.  Thus,  as  more  data  sets  are  added  to  the 
model,  the  coefficients  should  not  appreciably  change. 
Therefore,  one  may  test  for  linearity  by  comparing 
coefficients  in  the  original  model  to  those  obtained  after 
more  data  are  added.  In  a  model  building  scenario,  this 
concept  may  be  difficult  to  diagnose,  since  testing  and 
validation,  depending  on  the  data  producing  situation,  may 
require  that  data  sets  be  withheld.  In  such  cases,  it 
must  be  assumed  that  enough  cases  were  originally  included 
in  the  model  to  provide  valid  estimates  of  the  parameters. 
The  author  will  test  for  linearity  by  comparing  the 
coefficients  obtained  with  the  twelfth  data  set,  to  those 
using  the  last  complete  data  set,  the  thirteenth. 

To check for heteroscedasticity, or the lack of a
constant error-term variance, a plot of the errors (known
as residuals) versus the predicted values is helpful.
Residuals should be scattered in a band of constant width
around a mean of zero, which indicates constant error-term
variance.

In Captain Simpson's thesis, heteroscedasticity was
addressed by performing a logarithmic transformation of
the dependent variable. For the purpose of simplicity,
the author assumed that heteroscedasticity exists, and
while it was investigated, remedial measures were limited
to the logarithmic transformation outlined in Simpson's
methodology:

$$\text{trate} = -\ln(\text{UB} - \text{rate} + \delta)$$

where:

trate = the transformed dependent variable
rate = the voluntary retention rate
UB = the upper bound of the rates (1.0)
$\delta$ = a small constant (.001)

The  constant,  delta,  was  determined  by  the  size  of  the 
largest  retention  rate  in  the  data.  This  occurred  in  year 
group  14,  where  the  highest  rate  observed  was  .9915.  The 
reader  is  referred  to  Captain  Simpson's  methodology  for  a 
further  discussion  of  this  variance  stabilizing  technique 
(26:33).
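
The transformation above, together with the inverse needed to turn a predicted trate back into a retention rate, can be sketched as follows. The inverse is obtained by solving the transformation for rate and is not quoted from Simpson's methodology. The function names are hypothetical.

```python
import math

UB = 1.0       # upper bound of the retention rates
DELTA = 0.001  # small constant that keeps the logarithm's argument positive

def transform(rate):
    """Return trate = -ln(UB - rate + DELTA)."""
    return -math.log(UB - rate + DELTA)

def inverse_transform(trate):
    """Solve trate = -ln(UB - rate + DELTA) for rate."""
    return UB + DELTA - math.exp(-trate)

# The highest rate mentioned in the text, carried through a round trip.
trate_max = transform(0.9915)
rate_back = inverse_transform(trate_max)
```

Because the argument of the logarithm stays positive for any rate up to the upper bound, the transformation is defined over the whole range of observed rates, and it increases as the rate increases.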

Autocorrelation is defined as the correlation between
the residual in one time period and that in the previous
time period. If autocorrelation is present,
non-independence of the residuals is implied. This is common
in time series analysis, and may be detected with the use
of a runs test (19:130). The runs test performed by
Statistix orders the standardized residuals by their
magnitude, and then plots them about their mean (residuals
are standardized by dividing them by the square root of the
regression's mean squared error [MSE]). The number of runs
(two or more consecutive values above or below the mean)
is then tabulated. Too few runs indicate positive
autocorrelation, while too many runs indicate negative
autocorrelation (27:8.3).

When  a  runs  test  is  used  in  this  manner,  it  is  known 
as  a  Wald-Wolfowitz  runs  test  (4:350).  The  Wald-Wolfowitz 
test statistic, T, is the total number of runs observed.
A table of values for this statistic must then be
consulted to determine if the residuals are random (and
hence independent). For the sample sizes involved in this
research, T should be greater than two but less than ten
to conclude randomness at a .05 level of significance
(4:414).
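
Counting runs can be sketched in a few lines. The sketch below assumes the residuals are supplied in observation order and counts every maximal sequence of values on the same side of the mean as one run, including sequences of length one, which is the usual convention for the Wald-Wolfowitz statistic. Statistix may tabulate runs differently. The function name and residuals are hypothetical.

```python
def count_runs(residuals):
    """Count runs of consecutive residuals on the same side of their mean.

    Values exactly equal to the mean are dropped before counting.
    """
    mean = sum(residuals) / len(residuals)
    signs = [r > mean for r in residuals if r != mean]
    runs = 1 if signs else 0
    for previous, current in zip(signs, signs[1:]):
        if current != previous:
            runs += 1
    return runs

# Hypothetical standardized residuals for twelve observations.
residuals = [0.5, 0.7, -0.2, -0.4, 0.3, -0.6, -0.1, 0.2, 0.4, -0.3, 0.1, -0.8]
T = count_runs(residuals)

# Decision rule quoted in the text for the sample sizes in this research.
looks_random = 2 < T < 10
```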

Outliers  are  values  of  the  DV  that  cannot  be 
accurately fit by the model. They may exist in some
regressions and may be remedied by omitting them. However, a
rationale  for  omission,  such  as  measurement  or  recording 
error,  should  exist  (20:505).  Since  the  DV  is  the 
retention  rates  of  USAF  pilots,  the  author  can  think  of  no 
reason  to  omit  outlier  observations,  if  they  do  indeed 
exist.  Therefore,  while  the  model  that  is  ultimately 
produced  may  contain  a  poor  fit  at  some  locations,  no 
remedial  action  will  be  taken  upon  such  deviations. 
Normal distribution of the residuals of the
regression model may be ascertained by use of the
approximate Wilk-Shapiro statistic.

If  the  assumptions  of  multiple  regression  are 
met,  the  standardized  residuals  should  be 
approximately  normally  distributed  with  mean  0 
and  variance  1.  The  i-th  rankit  is  defined  as 
the  expected  value  of  the  i-th  order  statistic 
for  the  sample,  assuming  the  sample  was  from  a 
normal  distribution.  The  order  statistics  of  a 
sample  are  the  sample  values  reordered  by  their 
rank  ....  The  approximate  Wilk-Shapiro 
statistic  calculated  is  the  square  of  the  linear 
correlation  between  the  rankits  and  the  order 
statistics (Shapiro and Francia 1972) . . . non-normality
. . . [is indicated by] . . . a small
value for the Wilk-Shapiro statistic. (27:8.5)

The analysis in this research used a critical value of
0.9 to test the approximate Wilk-Shapiro statistic
obtained in Statistix output to conclude whether or not
the error terms are normally distributed.
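
The statistic quoted above, the squared correlation between the rankits and the order statistics, can be sketched as follows. The rankits are approximated here with Blom's formula, which is an assumption of this sketch, so values may differ slightly from Statistix output. The function names and the sample data are hypothetical.

```python
from statistics import NormalDist, mean

def rankits(n):
    """Approximate expected normal order statistics (Blom's formula)."""
    normal = NormalDist()
    return [normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def approx_wilk_shapiro(sample):
    """Squared linear correlation between rankits and order statistics."""
    xs = sorted(sample)          # order statistics of the sample
    ms = rankits(len(xs))        # approximate rankits
    x_bar, m_bar = mean(xs), mean(ms)
    sxm = sum((x - x_bar) * (m - m_bar) for x, m in zip(xs, ms))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    smm = sum((m - m_bar) ** 2 for m in ms)
    return sxm ** 2 / (sxx * smm)

# A sample that matches the rankits exactly gives a statistic of one.
w_ideal = approx_wilk_shapiro(rankits(12))

# A sample dominated by one extreme value falls well below 0.9.
w_skewed = approx_wilk_shapiro([0.0] * 11 + [10.0])
```

Under the 0.9 critical value used in this research, the first sample would be judged consistent with normality and the second would not.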

The  investigation  for  omission  of  several  important 
variables  from  the  model  requires  a  plot  of  residuals 
versus  the  independent  variables  omitted.  This  plot  would 
reveal  any  important  descriptive  power  of  these  omitted 
variables  (19:128).  Since  this  model  building  effort 
dealt  with  a  variety  of  broad  economic  indicators,  it  was 
assumed that the model generated, based upon a high
value of $R_a^2$ (with the corresponding decrease in model
error), would not omit any significant variables. This is
not  to  say  that  other  significant  variables  may  not  exist, 
rather,  that  such  variables  have  not  initially  been 
included  for  investigation  and  testing. 
Thus, this research effort's aptness analysis will
consist of:

•  A comparison of coefficients to confirm
linearity,

•  Residual plots to reject heteroscedasticity,

•  The Wald-Wolfowitz runs test to confirm
error-term independence, and

•  The approximate Wilk-Shapiro statistic to
confirm the normal distribution of the error-terms.

Multicollinearity.

Another  widely  used  method,  though  not  related  to 
aptness  analysis,  used  to  determine  a  model's  internal 
validity  is  to  examine  the  IVs  for  multicollinearity. 
"Multicollinearity  refers  to  the  case  in  which  two  or  more 
explanatory  variables  in  the  regression  model  are  highly 
correlated,  making  it  difficult  or  impossible  to  isolate 
their individual effects on the dependent variable"
(24:182).

Still, if the intent of the regression is not to
isolate the effects of the IVs on the DV, but to predict
the DV, multicollinearity (and the corresponding inability
to conduct sensitivity analysis with the IVs), even though
it may exist, may be seen as a small penalty.
Neter, Wasserman, and Kutner provide an example of this
concept, using two variables that are perfectly
correlated. They conclude:

The  fact  that  some  or  all  independent 
variables  are  correlated  among  themselves  does 
not,  in  general,  inhibit  our  ability  to  obtain  a 
good  fit  nor  does  it  tend  to  affect  inferences 
about  mean  responses  or  predictions  of  new 
observations . . . (19:300).

The  broad  nature  of  some  of  the  IVs  that  are  to  be 
investigated  necessitates  the  assumption  that 
multicollinearity  exists  between  some  of  them.  Rather 
than  use  this  relationship  as  a  basis  for  omitting 
variables  from  the  model,  the  author  invokes  the  rationale 
stated  above  as  the  basis  for  not  investigating  it 
further,  since  predictive  ability  of  the  IVs  on  the  DV 
will  not  be  impaired. 

Standardized  Variables. 

A  standardized  variable  is  one  that  has  its  sample 
mean  subtracted  from  it,  and  is  then  divided  by  the  sample 
standard  deviation.  The  resulting  value  is  the  number  of 
standard  deviations  the  non-standardized  value  lies  from 
the  sample  mean.  Thus,  a  standardized  variable  has  its 
scale,  or  unit  of  measure,  removed  and  all  variables  then 
have  a  common  unit:  their  sample  standard  deviation.  In 
multiple regression (regression with more than one
independent variable), standardized variables are required
to do any meaningful sensitivity analysis. Since this
research will exclude sensitivity analysis, and instead
concentrate on obtaining accurate predictions,
standardizing will not be performed.
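
Although standardizing was not performed in this research, the definition given above can be illustrated briefly. The function name and the values are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(v - m) / s for v in values]

# Hypothetical observations of one IV.
z = standardize([2.0, 4.0, 6.0])
```

Each resulting value is the number of sample standard deviations the original value lies from the sample mean, so the first and last observations here sit one standard deviation below and above it.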

Validating  the  Model  Externally. 

External  validity  of  the  model  is  achieved  through 
accurate  prediction  of  the  dependent  variables.  This  will 
be  accomplished  by  withholding  data  sets  twelve  and 
thirteen  until  a  candidate  model  has  been  built.  This 
model  will  then  be  internally  tested  with  the  addition  of 
data set twelve. If $R_a^2$ remains at a high level, and the
IVs remain significant, then the model will be externally
validated by predicting the DV in data set thirteen. If a
reasonably close prediction is obtained (where the DV
falls within the 95 percent prediction interval determined
by Statistix output), this data set will in turn be added
to the model to generate a final set of coefficients.

If  the  candidate  model  fails  these  tests,  a  model 
using  a  different  combination  of  variables  will  be  sought 
and  similarly  tested.  Once  a  candidate  model  passes  these 
tests,  it  will  then  be  subjected  to  the  more  rigorous 
internal  validity  tests  discussed  in  this  chapter  before 
the  decision  to  ultimately  accept  or  reject  the  model  is 
made.  The  final  set  of  coefficients  will  then  be  used  to 
predict  retention  rates  by  YOS  in  FYs  90  to  92. 
Meeting the Research Objective

This model intends to meet the research objective of
predicting Air Force pilot retention three years in
advance, through the use of regression analysis
techniques.
IV. Findings and Analysis


Introduction


This  chapter  will  examine  refinements  to  the 
methodology  discussed  in  the  previous  chapter  and  the 
results  produced  by  models  built  with  the  original  and 
refined  processes. 

Initial  Model  Attempts 

The  author  originally  intended  to  closely  follow 
Captain  Simpson's  methodology  in  deriving  a  workable  model 
for  pilot  retention  in  year  groups  seven  through  fourteen. 
Captain  Simpson  was  able  to  build  one  model  to  provide 
output  for  all  year  groups  (in  his  research,  year  groups 
seven  through  eleven),  with  the  aid  of  dummy  variables. 
This  is  a  common  technique  used  in  regression  models  and 
it  worked  well  in  Captain  Simpson's  effort.  However, 
since  the  model  the  author  wished  to  produce  required  more 
pure  delay  from  a  different  set  of  independent  variables, 
he  was  not  able  to  build  a  similar  model  that  could 
produce valid estimators of the $\beta$ coefficients for each
year  group. 

Virtually  no  parameters  in  this  initial  model  were 
found  to  be  significant,  even  to  a  level  of  0.25.  The 
highest Ra² the author could obtain following Simpson's
methodology was .35, which the author deemed unacceptable.
The author then ran the MAXR procedure in SAS without the
use  of  dummy  variables.  This  necessitated  a  modeling 
effort  for  each  individual  year  group.  SAS  returned 
several  good  models  for  each  year  group.  The  author 
examined models based upon the highest value of R² (since
MAXR does not return a value for Ra²), with variables that
exhibited at least a 0.1 level of significance.

These  models  were  built  using  the  first  eleven  data 
sets.  The  testing  phase,  as  discussed  in  chapter  three, 
involved  adding  the  twelfth  data  set  to  the  models,  and 
investigating  their  parameters  for  significance.  This  had 
extremely  negative  effects  on  the  model  parameters,  as  all 
of  the  models,  except  for  10  YOS,  failed  to  maintain 
parameter  significance. 

The  DVs  for  each  year  group  were  subsequently 
examined.  It  was  noticed  that  for  the  first  eleven 
observations,  most  of  the  variability  of  the  DVs  was 
confined  to  a  relatively  narrow  band,  while  the  last  two 
observations  varied  greatly  from  this  band  of  previous 
observations.  It  was  therefore  decided  to  include  data 
set  twelve  in  the  variable  selection  process,  and  once  a 
candidate  model  was  built,  to  again  check  for  significance 
of  the  parameter  estimates  produced  with  these  variables, 
using  only  the  first  eleven  data  sets.  The  thirteenth 
data set was still withheld for validation purposes.

Deviations From the Planned Methodology

The  author  again  used  the  SAS  procedure,  MAXR,  to 
identify  potential  models  for  testing,  but  with  the  use  of 
twelve  data  sets.  The  results  were  less  than  encouraging. 
As may happen when data are collinear, a model with a high
R² may produce parameter estimates that are not
significant. Similarly, a high R² does not
guarantee that the model's Ra² will be
commensurately high. Herein lies a flaw in the MAXR
approach to model building, as the MAXR output only
produces the models with the highest value of R² and these
models  may  contain  flaws  not  discovered  until  subjected  to 
other  internal  tests.  Thus,  the  author  ended  his  use  of 
SAS  as  a  model  building  tool,  and  retained  only  the  10  YOS 
model  produced  by  the  initial  methodology. 
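
The gap between R² and Ra² can be illustrated with the standard adjustment formula, Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of IVs. The Python sketch below assumes this standard definition matches the one given in chapter three. The function name and the sample R² of 0.90 are hypothetical and are not values from this research.

```python
def adjusted_r2(r2: float, n_obs: int, n_ivs: int) -> float:
    """Standard adjusted R-squared for a regression that includes an intercept."""
    dof = n_obs - n_ivs - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical R-squared of 0.90 fitted to twelve data sets:
# the penalty grows as IVs are added to so few observations.
for n_ivs in (1, 3, 5, 8):
    print(n_ivs, round(adjusted_r2(0.90, 12, n_ivs), 4))
```

With only twelve observations, each added IV consumes a large share of the residual degrees of freedom, which is why a model chosen for its R² alone may show a much lower Ra².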

Statistix  software  was  thereafter  used  exclusively 
for  model  building,  testing  and  validation.  Statistix 
offered two main advantages over SAS: it was run on a
personal computer (PC), which meant obtaining quicker
results, and it had a different model building tool for
linear regression, All Subset Regression.

An  all  subset  regression  produces  a  model  for  each 
possible  combination  of  variables,  regardless  of  the 
values of R² or Ra² that result. The modeler must then
choose which models he should investigate further, based
upon the limited results produced by the All Subset
Regression procedure. For a model with M potential IVs,
2^(M-1) - 1 models would be built. Thus, the 11 IVs
researched in this effort produced 1023 subset regressions
for  each  year  group  modeled.  The  author  limited  his 
search  for  potential  models  to  those  that  had  the  highest 
values of Ra². This resulted in anywhere from ten to
thirty  models  to  be  investigated  further.  Again,  the 
first  twelve  data  sets  were  used,  because  of  the  problem 
with  DV  variability  encountered  earlier. 

These  models  were  then  investigated  for  significance 
of  the  regression  coefficients,  and  the  author  selected 
the  single  model  for  each  year  group  that  produced  the 
most  significant  parameters.  Data  set  twelve  was  then 
withheld,  and  the  regression  was  done  again  to  ascertain 
parameter  significance  using  only  eleven  data  sets. 

Transformation  to  Check  External  Validity 

If  the  parameters  remained  significant,  data  set 
twelve  was  added  to  the  regression,  and  the  coefficients 
produced  were  then  used  to  forecast  the  DV  in  data  set 
thirteen.  The  forecast  value  returned,  though,  was 
expressed  in  the  terms  of  the  transformed  DV,  discussed  in 
chapter  three.  To  make  the  forecast  numbers  meaningful, 
they  were  transformed  back  to  a  rate  with  the  use  of  the 
following formula:

rate = 1.001 - (1/exp(fval))

where:

rate = the forecast value transformed to a rate
exp = e (the base of the natural logarithm
system) raised to the power (x)
fval = the forecast value returned by the model
and the power to which e is raised

If the DV's actual value was within the 95 percent
prediction interval of the predicted value, the prediction
was deemed valid.
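
The back-transformation and the validity check can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and to_fval is simply the algebraic inverse of the formula above, assumed (not confirmed here) to correspond to the transformation described in chapter three.

```python
import math

OFFSET = 1.001  # constant taken from the back-transformation formula above


def to_rate(fval: float) -> float:
    """Convert a forecast in transformed units back to a retention rate."""
    return OFFSET - 1.0 / math.exp(fval)


def to_fval(rate: float) -> float:
    """Algebraic inverse of to_rate; requires rate < OFFSET."""
    return -math.log(OFFSET - rate)


def prediction_is_valid(actual: float, lower: float, upper: float) -> bool:
    """True when the actual rate lies inside the prediction interval."""
    return lower <= actual <= upper


# Hypothetical transformed forecast and interval end points.
fval, fval_lo, fval_hi = 1.30, 0.75, 1.95
rate, lo, hi = to_rate(fval), to_rate(fval_lo), to_rate(fval_hi)
print(round(rate, 4), round(lo, 4), round(hi, 4))
print(prediction_is_valid(0.70, lo, hi))
```

Because the transformation is monotonically increasing, the end points of a prediction interval computed in transformed units can be converted to rates one at a time without changing their order.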

Accepting  or  Rejecting  the  Model 

Aptness  analysis  was  conducted  for  all  models,  first 
using  only  eleven,  then  twelve,  and  finally  all  thirteen 
data  sets.  Forecasts  were  made,  using  the  final 
coefficients  and  data  sets  fourteen,  fifteen,  and  sixteen, 
for  FYs  90,  91,  and  92  respectively. 

Generally,  if  these  tests  for  internal  and  external 
validity  were  not  passed,  a  different  model  was  then 
selected  for  testing.  However,  one  or  two  minor 
deviations  did  occur  from  these  test  specifications,  and 
in  these  cases,  the  model  that  ultimately  was  produced  was 
the  best  model  possible,  even  though  its  internal  or 
external  validity  was  below  the  level  desired.  These 
deviations will be addressed in the discussion of the
particular models in which they occurred.

Alternate Forecasts


As  forecasts  were  prepared  for  each  year  group,  the 
IV  data  from  sets  fourteen  and  fifteen  were  of  no  use  in 
furthering  the  refinement  of  the  model’s  parameter 
estimates,  as  there  were  no  DVs  to  regress  them  against. 

To  include  these  data  for  that  purpose,  a  method  for 
estimating  the  DVs  for  data  sets  fourteen  and  fifteen  was 
required. 

One  may  initially  think  that  the  forecasts  themselves 
may  be  used  as  the  DV  in  these  data  sets.  However,  the 
forecast  itself  has  no  error  component,  so  the  addition  of 
these  data  sets  with  forecast  values  as  the  DVs  produces 
the  same  results  as  the  original  model.  Thus,  a  method 
for  adding  error  to  the  forecast  value  of  the  DVs  had  to 
be  established. 

This  was  accomplished  by  generating  random  numbers, 
between  zero  and  six,  with  the  use  of  MathCAD  software 
(16).  Error  in  a  regression  is  assumed  to  be  a  normally 
distributed  random  variable  with  a  mean  of  zero. 

Therefore,  three  was  subtracted  from  the  random  number 
generated,  to  represent  the  number  of  standard  deviations, 
either  above  or  below  the  mean,  the  fictitious  DV  rested 
(it  was  assumed  six  standard  deviations  encompassed  the 
entire  area  under  the  normal  curve).  (Note:  Caution 
should  be  exercised  when  generating  random  numbers  with 
MathCAD. If random numbers are produced using MathCAD
Version 2.5, the same set of numbers will be produced each
time the program is initialized, or if the randomize
command is used, because the seed of the random number
generator will be reset [16:172].)

The  result  was  multiplied  by  the  standard  deviation 
of  the  error  term,  and  added  to  the  forecast  value,  to 
produce  an  estimate  of  the  DV.  This  value  was  then 
included  in  data  set  fourteen  as  the  DV,  and  the  entire 
process  was  repeated  to  produce  an  estimate  for  the  DV  for 
data  set  fifteen.  Finally,  another  regression  was  done 
with  the  fifteen  data  sets,  and  the  coefficients  produced 
by  this  regression  were  used  to  forecast  the  DV  in  FY  92, 
using  data  set  sixteen.  The  results  of  this  alternate 
methodology  are  found  under  the  title  "alternate"  in 
Appendices  E  and  H. 
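
The procedure for manufacturing a DV with an error component can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the MathCAD worksheet used in this research: the function name, the seed, and the sample forecast and standard deviation are hypothetical, and the random number between zero and six is assumed to be uniformly distributed.

```python
import random


def synthetic_dv(forecast: float, error_sd: float, rng: random.Random) -> float:
    """Add a random error to a forecast, in transformed (fval) units."""
    draw = rng.uniform(0.0, 6.0)  # random number between zero and six
    n_sd = draw - 3.0             # standard deviations above or below the mean
    return forecast + n_sd * error_sd


# A fixed seed reproduces the same draws on every run, which mirrors the
# seed-reset behavior cautioned against in the note above.
rng = random.Random(1990)
dv_14 = synthetic_dv(forecast=1.25, error_sd=0.20, rng=rng)  # hypothetical values
dv_15 = synthetic_dv(forecast=1.10, error_sd=0.20, rng=rng)
print(round(dv_14, 4), round(dv_15, 4))
```

Under the uniform assumption every offset between minus three and plus three standard deviations is equally likely. A draw from a normal distribution, for example random.gauss(0.0, 1.0), would follow the stated normality assumption more closely; it is mentioned here only as an alternative and was not the method used.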

Analysis  of  the  Models 

The  models  will  be  discussed  separately,  since  each 
is  unique.  The  method  of  arriving  at  the  final  models 
followed one of three paths: 1) by the original
methodology outlined in chapter three; 2) by the adjusted
methodology outlined in this chapter; or, 3) by the
adjusted methodology with the inclusion of other variables
(which will be discussed when they occur in a particular
model).

Appendices C, D, E, and F contain data used to
internally  and  externally  validate  the  models.  Appendix  C 
contains regression coefficients for the various model
variables, and significance levels (p-values) for the
same. Appendix D contains values for R², Ra², Model
P, T, and the Approximate Wilk-Shapiro statistic. The
interpretation of these values was discussed in chapter
three. Model forecasts are in Appendix E. Appendix F
contains the graphs of residuals versus predicted values,
to check for heteroscedasticity.

Models  Using  the  Original  Methodology. 

As  previously  stated,  the  10  YOS  model  was  the  only 
one  developed  with  the  original  methodology.  The  model 
returned  by  the  MAXR  option  consisted  of  the  following 
IVs: cut, nbf, netgrow, perc, and comp. The model
parameters remained significant to a level of 0.01
throughout  the  last  three  data  sets.  Aptness  analysis 
revealed  no  deviations.  The  model  predicted  a  retention 
rate  of  0.7342  in  FY  89  (using  the  first  twelve  data  sets) 
with  a  95  percent  prediction  interval  of  0.5216  to  0.8616. 

The  actual  value  was  0.6944.  The  forecast  rate  for  FY  92 
is  0.7138. 

Models  Using  Adjusted  Methodology. 

7  YOS. 

The  model  developed  to  predict  retention  within  the  7 
YOS group used the following IVs: cut, lofac, nbf,
netgrow, and comp. All parameters were significant to a
level of .0814 in both the twelve and eleven data set
regressions. The best model obtainable at this
significance level returned an Ra² of only .8073 in the 12
data set model, though this increased to .8632 in the
final model (which used the first thirteen data sets).
Aptness  analysis  revealed  one  minor  deviation.  The 
Approximate  Wilk-Shapiro  value  of  .8959  from  the  12  data 
set  regression  did  not  meet  the  0.9  level  desired  to 
assert  error-term  independence.  This  value  returned  to 
.9349 in the final model. The author does not view this
deviation as significant enough to reject the randomness
assumption.  The  twelve  data  set  regression  predicted  a 
retention  rate  of  0.4535  for  FY  89  with  a  95  percent 
prediction  interval  of  0.0078  to  0.6992;  the  actual  rate 
was  0.4070.  The  forecast  rate  for  FY  92  is  0.4685. 

9  YOS. 

The  9  YOS  model  developed  consisted  of  the  following 
IVs:  cut,  nbf,  netgrow,  perc,  and  comp.  Parameters  were 
significant  to  a  level  of  .049  in  the  twelve  and  eleven 
data set regressions. Ra² remained high through all
regressions,  with  a  minimum  value  of  0.9528.  Aptness 
analysis  revealed  no  deviations.  The  twelve  data  set 
regression  predicted  a  retention  rate  of  0.6851,  with  a 
prediction interval of 0.5089 to 0.7983. The actual rate
for FY 89 was 0.6374. The forecast rate for FY 92 is
0.6256. 

11  YOS. 

The  model  originally  produced  by  the  adjusted 
methodology  was  comprised  of  the  IVs  sales,  netgrow,  gnp, 
and  comp.  This  model  was  slightly  more  valid  internally 
and  externally  than  the  one  ultimately  selected  and 
described  below.  It  was  not  selected  because  it  was 
discovered that sales included not only civilian but also
military aerospace orders and was therefore too broad
an indicator to be appropriate for inclusion as a
predictor  of  the  DV.  Although  disaggregated  sales  data  is 
available  that  excludes  military  orders,  this  fact  was  not 
discovered  until  it  was  too  late  to  alter  the  data  sets 
and  perform  further  model  building  efforts. 

Lfpart,  gnp,  comp,  nbf,  and  netgrow  comprise  the  IVs 
used  to  predict  the  11  YOS  year  group.  Model  parameters 
were  significant  to  a  level  of  0.074  through  the  twelve 
and  eleven  data  set  regressions,  though  the  final  model 
saw  nbf's  significance  drop  to  0.119.  Regressions  without 
this  variable  were  executed,  but  the  resultant  model's 
internal  and  external  validity  decreased.  Therefore,  it 
was  decided  to  retain  nbf  in  the  model.  The  final  model 
produced the lowest value of Ra²: 0.8617. Aptness
analysis revealed no deviations, but linearity was
investigated by a secondary method, since the coefficients
from the twelve and thirteen data set regressions differed
by a greater degree than that shown by most of the other
models.

The  secondary  method  chosen  was  suggested  by  Neter, 
Wasserman, and Kutner in their book, Applied Linear
Regression  Models  (19:118-120).  A  plot  of  the  residuals 
over  time  should  depict  the  residuals  lying  in  a 
horizontal  band  centered  around  zero.  Appendix  G  contains 
the  plots  for  the  models  that  required  the  use  of  this 
additional  test.  From  the  plot  for  11  YOS,  it  was 
concluded  that  the  IVs  were  indeed  linear  in  their 
explanation  of  the  DV. 

The  twelve  data  set  regression  produced  a  forecast  of 
0.9028,  with  a  prediction  range  of  0.7744  to  0.984.  The 
actual  rate  was  0.7911  for  FY  89.  The  forecast  rate  for 
FY  92  is  0.8769. 

14  YOS. 

The  14  YOS  model  consisted  of  the  following  IVs:  cut, 
lofac,  nbf,  netgrow,  and  comp.  Parameters  were 
significant  to  a  level  of  0.10  through  regressions  with 
data sets twelve and eleven. However, netgrow's
significance  fell  to  0.1194  for  the  final  model.  As  with 
nbf  in  the  11  YOS  model,  this  variable  was  excluded  from 
the  regression  and  the  tests  for  internal  validity 
conducted  again,  with  poorer  results  obtained.  Netgrow 
was accordingly reinstated to the model. Ra² remained
high through all regressions, with a minimum value of
0.9304.  Aptness  analysis  revealed  no  deviations.  The 
twelve  data  set  regression  produced  a  forecast  of  0.9528, 
with  a  prediction  interval  of  0.9105  to  0.9753.  The 
actual  value  for  FY  89  was  0.9315.  The  forecast  rate  for 
FY  92  is  0.9534. 

Models  Using  Additional  Variables. 

Three year groups (eight, twelve, and thirteen) could
not be predicted at the same level of validity attained
in the previous models solely by the use of the economic
indicators  described  in  chapter  two.  Further  examination 
of  the  DVs  from  each  year  group  revealed  that  the 
retention  rates  of  some  year  groups  may  have  a  predictive 
effect  on  the  rates  of  other  year  groups.  This 
relationship  was  first  noticed  between  year  groups  eleven 
and  twelve.  The  12  YOS  model  was  by  far  the  most 
difficult  to  build,  and  is  discussed  first.  The  other 
models  are  examined  in  the  order  in  which  they  were  built. 

12  YOS. 

The  best  pure  econometric  models  of  the  twelve  year 
group produced an Ra² in the 0.70 to 0.79 range, and were
therefore deemed unacceptable. Rather than accept a low
Ra², the author decided to digress from pure economic IVs
in order to obtain models with greater validity. By
examining the DVs from the eleven and twelve year groups,
it appeared that the twelfth year group's rates mirrored
those of the eleventh year group. From the author's
experience  as  a  pilot  in  a  strategic  airlift  squadron, 
this  made  sense,  since  he  witnessed  the  decision  making 
process  of  several  pilots  who  left  the  Air  Force,  and 
developed  the  scientifically  unfounded  belief  that  the 
decision  to  leave  is  in  part  motivated  by  peer  pressure. 
Thus,  the  decision  to  investigate  the  impact  of  peer  group 
retention  rates  was  made. 

As  mentioned  previously,  a  relationship  between  the 
twelve  and  eleven  year  group  retention  rates  seemed  to 
exist.  If  the  year  group  below  might  be  significant  in 
influencing  the  decision  to  stay  or  leave,  it  was  reasoned 
that  the  year  group  above  may  have  a  similar  impact.  It 
was  therefore  decided  to  investigate  the  possible  effects 
of  these  year  groups'  retention  rates  on  retention  in  the 
twelve  year  group. 

Defining  the  IVs  necessitated  some  creative  data 
management.  Initially,  the  rates  of  the  peer  groups 
themselves  were  intended  to  be  used.  However,  this  method 
has  a  major  flaw:  since  the  models  are  meant  to  predict 
retention  three  years  ahead,  the  retention  rates  of  the 
peer groups would not be known. Therefore, forecasts for
the last three complete data sets (eleven, twelve, and
thirteen) were used in place of the actual rates. In
addition, the rates themselves were not used as the IVs;
rather, the transformed rates, as described in chapter
three, were required. Regressions using both the original
rates and transformed rates were attempted, and the models
using transformed rate IVs produced results within the
range of valid predictions (zero to one), while the
original (untransformed) rate IVs did not.

While  forecast  rates  were  used  in  the  last  three  data 
sets  to  build  the  models,  the  actual  rates  were  used  for 
validation  and  forecasting  purposes.  Since  the  actual 
rates  were  indeed  known,  in  order  to  produce  the  best 
possible  forecasts,  it  did  not  make  sense  to  withhold  them 
from  the  entire  process. 

Three 12 YOS models were produced. 12 YOS "A" used
one IV, trate11 (the transformed retention rate for year
group 11). 12 YOS "B" is comprised of trate11 and trate14
(the transformed retention rate for the fourteen year
group). While the original intent was to use the
transformed rate from the thirteen year group, it was
decided that, since the 13 YOS model itself required the
use of an adjacent year group's retention rate as an IV
(the fourteenth), using the fourteen year group's
transformed rate would produce models with greater
validity. 12 YOS "C," a model using a combination of
economic IVs and a retention rate IV, was arrived at using
trate11, lfpart, gnp, and lofac.

Perhaps  the  reader  may  be  able  to  discern  the  best  12 
YOS model from the explanations below or from examining
the test data. The author, however, is reluctant to
flatly state which model is "best." Each has its strengths
and weaknesses, which is why the three are presented.

12 YOS "A."

The simplest of the three models, 12 YOS "A" is the
most internally valid as well. The single variable,
trate11, produced phenomenal results. Parameters were
significant to a level of 0.0 through all regressions,
while Ra² had a minimum value of 0.8448. Aptness analysis
revealed  no  deviations.  The  twelve  data  set  regression 
predicted  a  retention  rate  of  0.7817  for  FY  89  (with  a 
prediction  interval  of  0.4877  to  0.9073).  The  actual  rate 
was  0.5957.  The  forecast  rate  for  FY  92  is  0.8712. 

12 YOS "B."

12 YOS "B" is the least internally valid of the three
models. Trate14's best level of significance did not
occur until the final model and then it was still 0.2738.
Trate11 remained significant (0.0048) through all
regressions. Ra² peaked at 0.8816 in the final model.
Aptness analysis revealed no deviations, though linearity
was tested with the method described in the 11 YOS model.
The forecast from the twelve data set regression was
0.7572 for FY 89 with an interval of 0.3468 to 0.9101.
The forecast rate for FY 92 is 0.8482.

12 YOS "C."


This  model  produced  the  best  external  validity, 
though  internal  tests  were  somewhat  mixed.  Parameters 
were  significant  to  0.0537  in  the  final  model,  but  the 
significance  of  lfpart,  gnp,  and  lofac  was  less  in  the 
twelve and eleven data set regressions, with all values
ranging between 0.1 and 0.2. Ra² went from 0.782 in the
eleven set regression to 0.9078 in the final model.
Aptness analysis revealed no deviations. The forecast for
FY  89  using  twelve  data  sets  was  0.5798  with  an  interval 
of  0.0  to  0.8832.  The  forecast  rate  for  FY  92  is  0.7707. 

13  YOS. 

The  13  YOS  model  contained  the  following  IVs:  lfpart, 
lofac, perc, mttot, and trate14. Parameters were
significant  to  0.0791  through  regressions  with  twelve  and 
eleven  data  sets.  However,  mttot  fell  to  a  significance 
of  0.1946  in  the  final  model.  Regressions  without  mttot, 
and  various  combinations  of  its  elements  (cut,  form), 
yielded  no  better  results,  so  mttot  was  retained  at  the 
expense  of  the  model’s  internal  validity.  Aptness 
analysis  revealed  no  deviations,  though  linearity  was 
confirmed via the method used with the 11 YOS model. Ra²
fell from a high of 0.9519 in the eleven set regression,
to 0.8424 in the final model. The twelve set regression
produced a forecast of 0.7931 for FY 89, with an interval
of 0.5357 to 0.9081. The actual rate was 0.9014. The
forecast  rate  for  FY  92  is  0.7999. 

8  YOS. 

The  model  for  the  eight  year  group  was  the  last  one 
produced.  Initially,  a  fairly  good  model  (cut,  form, 
lfpart,  perc,  and  comp)  was  built  without  the  use  of  a 
"peer  group"  retention  rate.  However,  this  model's 
external  validity  test  overestimated  retention  in  FY  89  by 
0.1028,  and  although  this  was  within  the  95  percent 
prediction  interval,  the  author  viewed  overestimations  of 
this  magnitude  as  undesirable.  Hence,  a  peer  group 
retention  rate  was  used  to  obtain  greater  prediction 
accuracy.  The  only  other  pure  economic  model  that 
produced a larger overestimation was 11 YOS (0.1117), but
since  it  was  significant  in  the  prediction  of  the  twelve 
year  group  as  well,  no  attempt  to  refine  11  YOS  through 
the  use  of  peer  group  variables  was  made. 

The  8  YOS  model  ultimately  built  contained  the  IVs 
cut,  nbf,  gnp,  comp,  and  trate7  (the  transformed  retention 
rate from year group seven). Parameters were significant
to  a  level  of  0.1066  in  the  twelve  set  regression.  The 
eleven  set  regression  saw  nbf's  significance  drop  to 
0.2523,  which  was  initially  viewed  as  a  cause  for 
rejection.  However,  tests  without  nbf,  and  substituting 
all  other  variables  produced  either  similar  or  worse 
results in the eleven and twelve data set regressions.
Therefore, nbf was retained, though it detracts from the
model's internal validity. The final model saw all
variables significant to a level of 0.0855. Ra² remained
high  over  all  data  set  regressions,  with  a  minimum  value 
of  0.9257.  Aptness  analysis  revealed  no  deviations.  The 
twelve  set  regression  produced  a  forecast  rate  of  0.5051, 
with  an  interval  of  0.2224  to  0.6852.  The  actual  rate  for 
FY  89  was  0.4741.  The  forecast  for  FY  92  is  0.5136. 
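
The FY 89 external validity results reported in this chapter can be gathered in one place with the short Python sketch below. The figures are those quoted in the model discussions above; the dictionary layout and function name are hypothetical. The actual 12 YOS rate is stated only under model "A" and is reused here for models "B" and "C," since all three predict the same year group.

```python
# model: (forecast, interval lower, interval upper, actual), all for FY 89
FY89 = {
    "7 YOS": (0.4535, 0.0078, 0.6992, 0.4070),
    "8 YOS": (0.5051, 0.2224, 0.6852, 0.4741),
    "9 YOS": (0.6851, 0.5089, 0.7983, 0.6374),
    "10 YOS": (0.7342, 0.5216, 0.8616, 0.6944),
    "11 YOS": (0.9028, 0.7744, 0.9840, 0.7911),
    "12 YOS A": (0.7817, 0.4877, 0.9073, 0.5957),
    "12 YOS B": (0.7572, 0.3468, 0.9101, 0.5957),
    "12 YOS C": (0.5798, 0.0000, 0.8832, 0.5957),
    "13 YOS": (0.7931, 0.5357, 0.9081, 0.9014),
    "14 YOS": (0.9528, 0.9105, 0.9753, 0.9315),
}


def summarize(results):
    """Return {model: (forecast minus actual, actual inside interval?)}."""
    rows = {}
    for model, (forecast, lower, upper, actual) in results.items():
        rows[model] = (round(forecast - actual, 4), lower <= actual <= upper)
    return rows


rows = summarize(FY89)
for model, (error, inside) in rows.items():
    print(f"{model:9s} error={error:+.4f} within interval={inside}")
```

A positive error denotes an overestimate of retention. The 11 YOS entry reproduces the 0.1117 overestimation cited above.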

Summary 

Adjustments  to  the  original  methodology  were  required 
to  produce  statistically  valid  models  of  year  groups  seven 
through  fourteen.  Some  year  groups  could  not  be 
adequately  predicted  through  economic  relationships  alone 
and  thus  peer  group  retention  rates  were  required  to  help 
explain retention.

V. Conclusions and Recommendations


Introduction 

This  chapter  will  examine  the  practical  implications 
of  the  research  results,  the  policy  implications  for 
management,  and  recommendations  for  model  refinement. 

Practical  Implications  of  the  Results 

A  model  for  each  year  group,  from  seven  to  fourteen 
YOS  was  produced.  These  models  demonstrated  the  ability 
to  forecast  pilot  retention  rates  three  years  ahead,  using 
statistically  significant  IVs.  While  perhaps  models  that 
contained  greater  internal  validity  could  have  been 
produced  using  all  thirteen  data  sets,  their  external 
validity  would  have  been  unknowable. 

Variable Analysis.

Since  multicollinearity  was  not  investigated, 
sensitivity  analysis  is  not  possible.  The  sign  of  the 
regression  coefficients  in  the  models  produced,  however, 
may be an indication of the validity of these variables as
predictors. Three variables, nbf, gnp, and comp, did
perform  as  postulated  in  chapter  two.  The  sign  of  the 
regression  coefficients  of  nbf  and  gnp  was  negative, 
indicating  that  movement  of  these  variables  contributed  to 
the  opposite  movement  of  the  DVs  three  years  later.  The 
variable, comp, had a positive value, which suggests that
as it decreases in value, so does the DV it helps to
predict. 

What  is  puzzling  is  that  virtually  all  of  the  other 
IVs  introduced  in  chapter  two  and  ultimately  used  in  the 
final  models  (cut,  lofac,  netgrow,  perc,  and  mttot)  did 
not  behave  in  the  manner  anticipated.  For  example,  the 
variable  cut  was  significant  in  five  models  and  had  a 
positive  value.  The  positive  value  suggests  that  as  metal 
cutting  machine  tool  orders  increased  in  any  one  year,  the 
effect on the movement of the DV was positive three years
later.

The  reason  for  this  phenomenon  is  unknown,  but  may  be 
due  to  one  or  more  of  the  following  hypotheses: 

• The author’s intuitive interpretation of lagging
these variables for a period of three years may have been
incorrect.

• The actual movement of these IVs may have a
greater short-term impact on the DV than the relatively
long-term (three years) this research encompassed.

• Since multiple regression analysis examines the
combined effect of a set of IVs on the DV, interpretation
of the signs of the regression coefficients may be more
difficult.

• If multicollinearity exists, then any
interpretation of the sign of the coefficient may be
meaningless.

• Other reasons not postulated by the author.

Another  tool  used  to  analyze  variables  is  the 
comparison  of  their  regression  coefficients.  The 
magnitude of these coefficients reveals the relative
importance  of  the  various  variables  in  a  model.  However, 
to  apply  this  technique,  the  variables  must  be 
standardized,  as  discussed  in  chapter  three.  Early  in 
this  research,  it  was  decided  that  the  predictive  ability 
of  the  models  was  of  prime  importance.  Hence,  sensitivity 
analysis  and  the  relative  importance  of  the  variables  in  a 
particular  model  were  not  deemed  to  be  within  the  scope  of 
the  research.  In  retrospect,  such  analyses  may  be  of 
interest  to  users  of  these  models,  and  will  therefore  be 
included  as  suggested  model  refinements. 

The  Mobley  Paradigm. 

That  this  research  produced  statistically  valid 
models  is  but  the  first  step  in  reversing  the  trend  in 
pilot  retention  the  Air  Force  is  experiencing.  The 
information  these  forecasts  will  provide  should  signal 
leadership  as  to  whether  or  not  previous  trends  will 
continue.  Using  the  Mobley  paradigm  for  turnover 
management,  these  models  should  allow  Air  Force  leaders  to 
anticipate  the  turnover  event.  If  the  forecasts  are 
unacceptably  low,  policies  and  programs  should  be 
developed to turn the forecasts into a self-negating
prophecy (14:271). Forecasts of this nature will allow
leadership to develop proactive rather than reactive
policies  and  programs  to  address  pilot  retention. 

Policy  Implications  for  Management 

The  models  produced  by  this  research  generally  show 
either  a  leveling  or  decrease  in  retention  for  FY  90, 
followed  by  an  increase  in  FY  91,  then  a  decrease  in  FY  92 
to  a  level  lower  than  the  rate  traditionally  enjoyed  or 
desired.  It  is  known  that  strong  airline  hiring  will 
continue  for  at  least  the  remainder  of  this  decade,  having 
a  negative  influence  on  retention.  Thus,  policies  should 
be  developed  not  only  to  attenuate  the  "siren"  effect  this 
hiring  has  on  those  who  are  already  pilots,  but  to  recruit 
those  who  will  be  less  influenced  by  the  siren's  lure. 

The  Systems  Perspective. 

The  author  believes  policies  should  be  pursued  that 
will  take  advantage  of  the  interaction  an  open  system  must 
have  with  its  environment  in  order  to  survive.  The  Air 
Force  leader  needs  to  be  able  to  take  the  chaos  presented 
to  him  in  the  form  of  poor  pilot  retention  and  diminished 
resources,  and  use  them  to  his  advantage.  To  illustrate, 
when  a  leader  is  handed  "lemons,"  he  should  have  the  tools 
and flexibility at his disposal to make "lemonade."

"If It Ain't Broke, Don't Fix It."

The  author  has  received  input  from  some  military 
sources who have suggested that lemonade will be made of
the budget cut and pilot shortage lemons currently being
served  by  the  Air  Force's  environment.  Theoretically, 
budget  cuts  could  drive  force  levels  so  low  as  to  make  the 
pilot  retention  problem  evaporate.  Magically,  retention 
would  be  at  the  level  required.  Changes  in  the  personnel 
system  would  not  be  required  since  the  problem  would 
disappear. However, Peters believes that, for
organizations to remain competitive, "if it ain't broke,
you just haven't looked hard enough. Fix it anyway"
(23:1). Today's lemons should not be viewed as lemons at
all, but as opportunities to make the far-reaching proactive
changes tomorrow's Air Force requires.

Policy  Ideas. 

Often, it is an idea that is thought of as dumb or
impractical that eventually yields the desired results
(23:529-5'*3). In the course of this research, several
policy ideas occurred to the author that may be of some
value to those who are poised to make the sweeping changes
in the Air Force that are on the planning horizon. As the
Air Force enters an era filled with uncertainty and chaos,
"business as usual" will likely not suffice. The "dumb"
ideas should at least be aired; the "dumb" questions
should be asked.

Flexibility. 

Flexibility is the key to airpower. The concept of
flexibility is not foreign to any military leader when it
comes to the strategy and tactics of warfare. Yet, when
it comes to personnel management in the Air Force,
flexibility has been eroded by attempts to, among other
things, control pilot retention. Open systems are stable,
while relatively closed systems are less so. Thus, attempts
to control retention by increasing ADSCs for UPT, permanent
change of station (PCS) moves, and qualification in a
different aircraft may have the opposite effect from that
intended: turnover rates could eventually increase. A
ten-year commitment for UPT graduation seems excessive for
this reason. While a ten-year ADSC will eradicate retention
problems during an officer's first ten years of duty after
UPT, when he eventually becomes eligible to leave, he may
be more inclined to do so than if the ADSC had not been so
lengthy.

The  Peer  Group  Effect. 

Of  significance  in  this  research  is  the  discovery  of 
the  peer  group  effect  discussed  in  chapter  four.  Assuming 
this  effect  among  pilots  does  indeed  exist,  it  follows 
that  when  the  intent  to  quit  is  formulated  by  one  pilot, 
it  will  in  turn  have  an  impact  on  the  intents  of  his 
contemporaries.  Thus,  if  a  pilot's  decision  to  leave  the 
Air Force is made after having served five years, for
example, he will have five more years to influence the
intents of others who may not be so disposed.

General  systems  theory,  when  applied  to  the  case  of 
ADSCs,  might  follow  a  different  path  to  turnover 
management.  Rather  than  increase  ADSCs  for  UPT,  they 
might  be  reduced  or  eliminated  altogether.  A  "no 
commitment  for  UPT"  approach  would  increase  the  personnel 
system’s  interaction  with  the  environment,  and  therefore 
move  it  towards  stability.  On  the  surface,  this  idea  may 
sound  silly,  but  it  would  reduce  the  impact  of  the  peer 
group effect. Those not inclined to serve would leave
as soon as the decision is made, before they have years in
which to influence their peers. Those who stay will have made their own
decision  to  stay,  having  found  their  own  reasons  for 
reinforcing  this  decision.  In  so  doing,  the  positive 
group  norm  that  is  required  for  healthy  retention  will 
theoretically  evolve. 

While  reducing  or  eliminating  ADSCs  from  UPT  may 
decrease  the  negative  effects  of  peer  pressure,  it  will 
not  reduce  one’s  susceptibility  to  that  pressure.  The 
United  States  Air  Force  Academy  (USAFA)  supplies  the  bulk 
of  UPT  candidates  in  any  given  year.  Among  the  many  tools 
used  to  shape  the  behavior  of  these  candidates  while  they 
are  undergraduates  is  a  peer  rating  system  that  is 
known  as  the  Leadership  Attribute  Survey.  This  system 
currently allows each cadet to rate the lowest four cadets
and the single top cadet in his squadron from his class
and all classes below. The results are then briefed to
each cadet individually by his Air Officer Commanding
(AOC), with recommendations for improvement if poor marks
are attained. Such a system tends to make the cadet more
aware of what his peers think of him, or may think of him
if he exhibits behavior that deviates from the norms of
the group.

Other officer accession programs may have similar
formal tools for "leveling" group norms. The author
believes such devices may have the unintended effect of
making each officer more susceptible to peer pressure.
This may in turn help to explain the peer group effect.
Since any group will inevitably establish norms,
regardless of the formal system that exists to instill
them, the author suggests that the greater good may be
served by overhauling or eliminating the devices used by
the Air Force to instill group norms in its training
environments.

The Rated Supplement.

The assertion made in the first chapter that there
can be functional turnover in the pilot career field is
only partially correct. Certainly, some turnover may
occur that will not damage the Air Force's present war
fighting capability. Yet, when dysfunctional turnover
occurs (as it is at present), there are fewer pilots in the
rated supplement to be drawn upon to sustain readiness,
because turnover previously regarded as functional has
already thinned their ranks. Therefore, most pilot turnover
contains an element of dysfunctionality. If more pilots were
retained, regardless of the desired retention rate, more
would be available to man cockpit vacancies created when
retention dropped to unacceptable levels. Thus, we see
the beauty and logic of a healthy rated supplement
program.

Viewing  some  turnover  as  functional  has  another 
potential  drawback:  it  creates  a  smaller  pool  of 
prospective  leaders.  The  Air  Force  draws  upon  the  rated 
force  for  the  bulk  of  its  senior  leadership.  Since  the 
pool  of  potential  leaders  is  reduced  when  functional 
turnover  occurs,  it  may  be  seen  as  a  limiting  factor  on 
the  future  availability  of  leadership  resources. 

Sabbatical.

Universities  provide  their  professors  with 
opportunities  to  take  extended  breaks  from  their  official 
duties  to  pursue  studies  that  will  further  their  knowledge 
in  their  area  of  expertise.  These  breaks  are  known  as 
sabbaticals.  The  author  believes  a  similar  opportunity 
should be extended to every career officer. Those who
have achieved "career" status (for example, selection for
promotion to Major or Lieutenant Colonel) should be
afforded an opportunity to pursue an independent study
program that will be responsible for nothing more than
furthering the personal development and growth of the
officer.

The  sabbatical  concept  has  two  chief  aspects  that 
should  positively  influence  retention.  It  provides  an 
incentive  for  pilots  to  remain  on  active  duty  to  reach  the 
point  where  the  sabbatical  may  be  taken.  It  further 
motivates  these  officers  by  allowing  them  to  develop  their 
own  personal  development  programs.  In  return,  the  service 
would  most  likely  receive  refreshed,  more  productive 
officers  upon  their  resumption  of  active  service. 

Recruiting.

Since  the  service  cannot  recruit  its  senior  military 
leadership  from  the  civilian  work  force,  the  quality  of 
the  officer  candidates  recruited  in  any  one  year  reflects 
directly  upon  the  quality  of  its  leadership  years  later. 
Therefore,  recruiting  the  right  personnel,  while  no  easy 
task,  is  important. 

The  task  of  transforming  raw  recruits  into 
committed  stars,  able  to  cope  with  the  pace  of 
change  that  is  becoming  normal,  begins  with  the 
recruiting  process  ....  The  best  .  .  .  insist 
that  line  people  dominate  the  process  .... 
(23:379) 

Thus, Air Force leadership may wish to consider using
flying squadron commanders actively in the recruiting
process.

Other Ideas.

Other  policies  and  programs  that  may  have  some 
positive  effect  on  pilot  retention  might  include: 

• an airline job placement program for those pilots
who have formally declared their intention to quit,

• a rating system for flying unit commanders that
includes a measure of each unit's pilot retention rate,

• a retention program that is a continuous effort
over the period of an officer's usefulness to the service,

• a recruiting program that emphasizes the
philosophical aspects of military service, and

• an assessment of the impact upon active duty
pilot retention that the availability of Air National
Guard and Air Force Reserve pilot positions may have.

The author recognizes that some of the ideas
presented in this chapter may be viewed as being tainted
with "ivory tower" idealism. However, if business as
usual is the order of the day, even the best leadership
may fail to positively impact pilot retention. As Warren
Bennis wrote in The Leadership Challenge, reflecting on
his time as President of the University of Cincinnati,

My  moment  of  truth  came  toward  the  end  of  my 
first  ten  months.  It  was  one  of  those  nights  in 
the  office.  The  clock  was  moving  toward  four  in 
the  morning,  and  I  was  still  not  through  with  the 
incredible  mass  of  paper  stacked  before  me.  I 
was  bone  weary  and  soul  weary,  and  I  found  myself 
muttering,  "Either  I  can't  manage  this  place,  or 
it's  unmanageable."  I  reached  for  my  calendar 
and  ran  my  eyes  down  each  hour,  half-hour, 
quarter-hour to see where my time had gone that
day, the day before, the month before... My
discovery  was  this:  I  had  become  a  victim  of  a 
vast,  amorphous,  unwitting,  unconscious 
conspiracy  to  prevent  me  from  doing  anything 
whatever  to  change  the  university's  status  quo. 
(23:497) 

In  the  uncertain  times  ahead,  maintaining  the  status  quo 
is not likely to ensure that the Air Force can attract and
retain the quality people it will require.


Recommendations  for  Model  Improvement 

When  this  research  effort  began,  its  original 
objective  was  to  produce  a  model  that  could  predict 
pilot retention not only by YOS but also by weapon system
(the type of aircraft flown). As the author became
embroiled  in  the  model  building  effort,  the  research  was 
scaled  back  to  include  only  the  former.  Thus,  these 
models  may  be  improved  by  providing  estimates  of  weapon 
system  retention  rates.  The  author's  intent  was  to 
accomplish  this  by  simply  using  historical  probabilities. 
More  accurate  predictions  might  be  obtained,  however, 
through  the  use  of  a  logit  regression  procedure. 

Some  of  these  models  have  parameters  with 
significance  levels  greater  than  0.1,  which  the  reader  may 
view  as  a  cause  for  rejecting  the  variable  from  the  model. 
In  the  models  where  the  parameter  significance  grew  over 
the  three  data  set  span  that  was  used  for  testing  and 
validation,  other  modelers  may  eventually  wish  to  reject 
such variables. Since the retention rates for FY 90 will
be known in October of this year, it is recommended that
the models again be validated using these new rates.

The  models  produced  by  this  methodology  did  not 
address  multicollinearity  per  se,  rendering  sensitivity 
analysis  impossible.  To  further  the  utility  of  these 
models,  multicollinearity  should  be  investigated.  Since 
it  is  assumed  to  exist  in  some  of  the  models,  ridge 
regression techniques may be applicable as a remedial
measure.
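
As a hedged illustration of how multicollinearity among the IVs might be screened, the following sketch computes variance inflation factors (VIFs) with NumPy. The VIF is a common diagnostic that this research did not use. The five IVs are those listed for the 10YOS model in Appendix C, their values are transcribed from Appendix A, and the function name is hypothetical. It is a sketch under these assumptions, not the method used to build the models.

```python
import numpy as np

# IV values transcribed from Appendix A (sets 1-16, years 74-89).
# Using the five IVs listed for the 10YOS model is an illustrative choice.
ivs = {
    "cut":     [3735.2, 1544.7, 2633.9, 3271.9, 4671.7, 5718.8, 4533.3, 2370.2,
                1064.0, 1108.8, 1779.0, 1670.9, 1355.6, 1232.8, 2232.5, 1564.5],
    "nbf":     [111.0, 108.8, 117.2, 130.8, 138.1, 138.3, 129.9, 124.8,
                116.4, 117.5, 121.3, 120.9, 120.4, 120.9, 124.1, 124.8],
    "netgrow": [9.4, 10.0, 9.6, 10.4, 10.7, 11.5, 11.5, 10.3,
                10.0, 9.5, 9.5, 9.7, 9.6, 9.7, 9.8, 10.1],
    "perc":    [0.4484, 0.4542, 0.4508, 0.4619, 0.4658, 0.4699, 0.4758, 0.4802,
                0.4860, 0.4906, 0.4952, 0.4996, 0.5039, 0.5074, 0.5095, 0.5121],
    "comp":    [0.977, 0.974, 0.954, 0.957, 0.939, 0.932, 0.954, 0.998,
                0.960, 0.946, 0.936, 0.923, 0.912, 0.899, 0.905, 0.896],
}


def variance_inflation_factors(columns):
    """Return {name: VIF}, where VIF_j = 1 / (1 - R^2 of IV j on the other IVs)."""
    names = list(columns)
    X = np.column_stack([np.asarray(columns[n], dtype=float) for n in names])
    # Standardizing centers every column, so no intercept term is needed below.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    vifs = {}
    for j, name in enumerate(names):
        target = Z[:, j]
        others = np.delete(Z, j, axis=1)
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / (target @ target)
        vifs[name] = 1.0 / (1.0 - r2)
    return vifs


vifs = variance_inflation_factors(ivs)
for name, value in vifs.items():
    print(f"{name:8s} VIF = {value:8.2f}")
```

A VIF near 1 suggests an IV is nearly uncorrelated with the others. Values above roughly 10 are a common rule-of-thumb warning, and would indicate the models in which a remedial measure such as ridge regression is worth trying.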

As  discussed  earlier  in  this  chapter,  the  magnitude 
of  the  regression  coefficients  may  be  a  useful  analytical 
tool  to  some  model  users.  To  perform  this  analysis,  the 
IVs for each model should be standardized. The regression
for each model should subsequently be reaccomplished, and
the coefficients returned could then be analyzed for
relative importance.
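
The standardization step can be sketched as follows, with several assumptions stated plainly. The fit is a plain least-squares regression with an intercept on the raw 7YOS rates, using three of the IVs listed for that model. IV set i is paired with DV set i, which is the three-year lead implied by the set and year columns of Appendix A. The sketch is not claimed to reproduce the coefficients in Appendix C, and the helper name `ols` is hypothetical.

```python
import numpy as np

names = ["cut", "nbf", "netgrow"]

# IV sets 1-13 (years 74-86), transcribed from Appendix A.
X = np.array([
    [3735.2, 111.0,  9.4],
    [1544.7, 108.8, 10.0],
    [2633.9, 117.2,  9.6],
    [3271.9, 130.8, 10.4],
    [4671.7, 138.1, 10.7],
    [5718.8, 138.3, 11.5],
    [4533.3, 129.9, 11.5],
    [2370.2, 124.8, 10.3],
    [1064.0, 116.4, 10.0],
    [1108.8, 117.5,  9.5],
    [1779.0, 121.3,  9.5],
    [1670.9, 120.9,  9.7],
    [1355.6, 120.4,  9.6],
])

# 7YOS retention rates for DV sets 1-13 (years 77-89), assumed to pair
# row-for-row with the IV sets above (a three-year lead).
y = np.array([0.848, 0.7505, 0.6374, 0.6675, 0.7925, 0.8521, 0.8745,
              0.8024, 0.7494, 0.7243, 0.6059, 0.4702, 0.407])


def ols(design, response):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(response)), design])
    beta, *_ = np.linalg.lstsq(A, response, rcond=None)
    return beta


raw_coef = ols(X, y)

# Standardize each IV to zero mean and unit (sample) standard deviation.
s = X.std(axis=0, ddof=1)
Z = (X - X.mean(axis=0)) / s
std_coef = ols(Z, y)

# Rank the IVs by the magnitude of their standardized coefficients.
ranking = sorted(zip(names, std_coef[1:]), key=lambda t: abs(t[1]), reverse=True)
for name, b in ranking:
    print(f"{name:8s} standardized coefficient = {b:+.4f}")
```

Each standardized coefficient equals the raw coefficient multiplied by that IV's standard deviation, so the ranking expresses the change in the DV per one-standard-deviation change in each IV. Where multicollinearity is present, such rankings should be read with caution.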

Recommendations  for  Further  Research 

Perhaps  the  most  significant  revelation  this  research 
produced  was  the  effect  that  forecast  retention  rates  of 
some  year  groups  had  on  other  year  groups.  This  effect 
should  be  investigated  further.  It  may  exist  across  all 
year  groups.  A  related  research  area  would  be  to 
investigate  the  true  causes  of  the  peer  group  effect,  if 
it does indeed exist.

The author recommends that multicollinearity be
investigated,  and  remedial  measures  attempted,  in  the 
models  where  it  is  found  to  exist.  These  new  models  may 
then  be  revalidated  using  the  FY  90  DV  data  that  will  be 
available  in  October  1990.  New  forecasts  may  then  be  made 
and  compared  to  the  forecasts  produced  in  this  research  to 
assess  the  true  impact  that  multicollinearity  has  on  the 
predictive  ability  of  the  models.  In  addition,  once 
remedial  measures  for  multicollinearity  have  been  taken, 
sensitivity  analysis  will  then  be  possible. 

A better method for introducing the error required
for the alternate forecasting method may be available.
This research used the MathCAD random number generator and
an assumption that six standard deviations encompassed the
total area under the normal curve. Perhaps there are
existing methodologies that accomplish this error
introduction in a more statistically valid manner.
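
The six-standard-deviation assumption can be illustrated with a short sketch using only the Python standard library. How the error terms were actually generated in MathCAD is not documented here, so the helper names, the `spread` value, and the seed below are all hypothetical. The sketch only shows how much of the normal curve lies within plus or minus three standard deviations, and one way a spread could be converted into a standard deviation for drawing errors.

```python
import math
import random


def coverage_within(k):
    """Fraction of a normal distribution lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def sigma_from_spread(spread):
    """Hypothetical helper: treat a total spread as six standard deviations."""
    return spread / 6.0


def draw_errors(spread, n, seed=0):
    """Draw n zero-mean normal errors whose sigma is spread / 6."""
    rng = random.Random(seed)
    sigma = sigma_from_spread(spread)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


cov = coverage_within(3.0)
errors = draw_errors(spread=0.06, n=5)  # 0.06 is an arbitrary example spread
print(f"area within +/- 3 sigma: {cov:.4f}")
print("example errors:", [round(e, 4) for e in errors])
```

Six standard deviations cover about 99.73 percent of the area rather than all of it, so the assumption slightly understates the tails. Estimating the standard deviation directly from the regression residuals would be one alternative to consider.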

Other  variables  may  have  better  predictive  ability  of 
the  DV  than  those  used  in  this  research.  Besides  the 
civilian  aerospace  sales  variable  that  was  discussed  in 
chapter  four,  oil  prices  or  aviation  fuel  prices  may  have 
some  significant  predictive  ability.  Suggested  sources 
for  other,  possibly  significant,  variables  are  included  in 
Appendix I.

Summary


A  set  of  models  that  forecast  pilot  retention  rates 
three  years  in  advance,  for  year  groups  seven  through 
fourteen,  was  constructed.  Statistically  significant 
economic  variables  were  primarily  used  to  build  these 
models.  Some  models,  however,  required  the  use  of  peer 
group  variables  to  adequately  explain  their  DVs. 

These  models  will  supply  Air  Force  leaders  with  the 
ability  to  anticipate  retention  rates  33  months  in  advance 
of  when  the  actual  rates  become  known.  Armed  with  the 
forecasts  of  future  rates,  policies  and  programs  may  then 
be  developed  to  mitigate  any  forecast  rates  that  are 
deemed  too  low.  Once  these  policies  and  programs  have 
been  implemented,  their  effectiveness  should  be  evaluated 
through the use of actual retention data.

Appendix A: Data Sets


IV  Data  Sets 

set  year  cut  form 

1  74  3735.2  1048.1 

2  75  1544.7  608.77 

3  76  2633.9  875.37 

4  77  3271.9  1178.3 

5  78  4671.7  1347.6 

6  79  5718.8  1432.6 

7  80  4533.3  1015.2 

8  81  2370.2  762.77 

9  82  1064  433 

10  83  1108.8  524.54 

11  84  1779  928.51 

12  85  1670.9  608.66 

13  86  1355.6  510.1 

14  87  1232.8  566.7 

15  88  2232.5  727.9 

16  89  1564.5  658.75 

set  year  nbf  netgrow 

1  74  111  9.4 

2  75  108.8  10 

3  76  117.2  9.6 

4  77  130.8  10.4 

5  78  138.1  10.7 

6  79  138.3  11.5 

7  80  129.9  11.5 

8  81  124.8  10.3 

9  82  116.4  10 

10  83  117.5  9.5 

11  84  121.3  9.5 

12  85  120.9  9.7 

13  86  120.4  9.6 

14  87  120.9  9.7 

15  88  124.1  9.8 

16  89  124.8  10.1 


set   year   acship   lfpart   lofac   sales
1     74     263      61.3     54.9    57.703
2     75     285      61.2     53.7    48.904
3     76     238      61.6     55.4    57.052
4     77     159      62.3     56.2    56.166
5     78     241      63.2     61.5    68.975
6     79     376      63.7     63      82.952
7     80     383      63.8     59      82.147
8     81     388      63.9     58.6    77.34
9     82     236      64       59      84.9
10    83     262      64       60.7    88.162
11    84     188      64.4     59.2    97.4
12    85     273      64.8     61.4    100.09
13    86     329      65.3     60.3    97.278
14    87     261      65.6     62.3    101.19
15    88     422      65.9     62.5    121.27
16    89     445      66.5     63      117.24

set   year   perc     gnp      comp
1     74     0.4484   54       0.977
2     75     0.4542   59.3     0.974
3     76     0.4508   63.1     0.954
4     77     0.4619   67.3     0.957
5     78     0.4658   72.2     0.939
6     79     0.4699   78.6     0.932
7     80     0.4758   85.7     0.954
8     81     0.4802   94       0.998
9     82     0.486    100      0.96
10    83     0.4906   103.9    0.946
11    84     0.4952   107.7    0.936
12    85     0.4996   110.9    0.923
13    86     0.5039   113.9    0.912
14    87     0.5074   117.7    0.899
15    88     0.5095   121.3    0.905
16    89     0.5121   126.3    0.896

DV Data Sets


set   year   7YOS     8YOS     9YOS     10YOS    11YOS    12YOS    13YOS    14YOS
1     77     0.848    0.9313   0.9513   0.9515   0.9684   0.9692   0.9743   0.9867
2     78     0.7505   0.8091   0.8718   0.9343   0.9487   0.9655   0.9745   0.9758
3     79     0.6374   0.6937   0.7717   0.8126   0.8733   0.893    0.9469   0.9737
4     80     0.6675   0.7992   0.8262   0.8976   0.9244   0.9187   0.9522   0.9749
5     81     0.7925   0.8418   0.89     0.9156   0.9585   0.9781   0.9808   0.9875
6     82     0.8521   0.9033   0.94     0.9591   0.9681   0.9728   0.9895   0.9948
7     83     0.8745   0.9429   0.9616   0.9782   0.9747   0.9841   0.9617   0.9915
8     84     0.8024   0.9018   0.9479   0.9632   0.9632   0.9674   0.9573   0.9929
9     85     0.7494   0.8143   0.8634   0.8782   0.9344   0.9564   0.955    0.9864
10    86     0.7243   0.7918   0.8262   0.8475   0.8762   0.9365   0.9864   0.9816
11    87     0.6059   0.7237   0.7875   0.8597   0.8559   0.852    0.967    0.972
12    88     0.4702   0.63     0.7423   0.7756   0.8128   0.6702   0.9413   0.9603
13    89     0.407    0.4741   0.6374   0.6944   0.7911   0.5957   0.9014   0.9315

Appendix B: Major Airline Pilot Retirements, 1988 - 2025


year  total 

1988  481 

1989  519 

1990  727 

1991  798 

1992  1042 

1993  1144 

1994  1361 

1995  1362 

1996  1501 

1997  1575 

1998  1791 

1999  1893 

13194  total,  1990  -  1999 

2000  1882 

2001  1759 

2002  1835 

2003  1536 

2004  1206 

2005  1228 

2006  1304 

2007  1577 

2008  1608 

2009  1643 

2010  1556 

2011  1327 

2012  1257 

2013  1146 

2014  1127 

2015  1072 

2016  1075 

2017  909 

2018  691 

2019  511 

2020  443 

2021  331 

2022  178 

2023  84 

2024  38 

2025  9 

41526 total, 1988 - 2025
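
The totals in this appendix can be checked with a few lines of Python. The sketch below simply sums the yearly rows as transcribed here; any mismatch with a printed total would point to a transcription or arithmetic slip rather than to a different data source. The variable names are illustrative.

```python
# Yearly major-airline pilot retirements, transcribed from this appendix,
# listed in order from 1988 through 2025.
first_year = 1988
counts = [
    481, 519,                                                      # 1988-1989
    727, 798, 1042, 1144, 1361, 1362, 1501, 1575, 1791, 1893,      # 1990-1999
    1882, 1759, 1835, 1536, 1206, 1228, 1304, 1577, 1608, 1643,    # 2000-2009
    1556, 1327, 1257, 1146, 1127, 1072, 1075, 909, 691, 511,       # 2010-2019
    443, 331, 178, 84, 38, 9,                                      # 2020-2025
]
retirements = {first_year + i: n for i, n in enumerate(counts)}

# Subtotal for the 1990s and the overall sum of the yearly rows.
decade_1990s = sum(n for year, n in retirements.items() if 1990 <= year <= 1999)
grand_total = sum(retirements.values())

# Year with the largest number of projected retirements.
peak_year = max(retirements, key=retirements.get)

print("1990-1999 subtotal:", decade_1990s)
print("1988-2025 total:   ", grand_total)
print("peak year:         ", peak_year)
```

The subtotal is kept separate from the overall sum so that it is not counted twice.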
Appendix C: Regression Coefficients


                          Coefficients
Model         Variable    set 12       set 13

7YOS          cut         0.000246     0.00025
              lofac       0.086808     0.088382
              nbf         -0.04084     -0.04088
              netgrow     0.30373      0.30171

8YOS          cut         0.000263     0.000258
              nbf         -0.01921     -0.01853
              gnp         -0.01111     -0.010836
              comp        9.6071       9.7545
              trate7      0.82341      0.83905

9YOS          cut         0.000588     0.000588
              nbf         -0.05374     -0.05275
              netgrow     0.28625      0.28679
              perc        27.234       26.761
              comp        22.533       23.275

10YOS         cut         0.000456     0.000456
              nbf         -0.05027     -0.04929
              netgrow     0.58347      0.58399
              perc        20.939       20.473
              comp        21.676       22.407

11YOS         lfpart      1.6548       1.076
              gnp         -0.10384     -0.07108
              comp        15.113       17.585
              nbf         -0.05335     -0.03348
              netgrow     0.37535      0.4904

12YOS "A"     trate11     1.201        1.308

12YOS "B"     trate11     1.0653       1.0212
              trate14     0.18846      0.35044

12YOS "C"     lfpart      -1.0194      -0.9906
              gnp         0.050127     0.048751
              lofac       0.15224      0.14814
              trate11     1.4618       1.4518

13YOS         lfpart      -2.5602      -2.0371
              lofac       0.31756      0.28397
              perc        146.39       114.41
              mttot       0.000167     0.000136
              trate14     0.8462       0.60085

14YOS         cut         0.000226     0.000245
              lofac       0.17207      0.17912
              nbf         -0.03054     -0.03071
              netgrow     0.30008      0.29102
              comp        21.731       24.406

Appendix D: Model R², Ra², P, T, and
Approximate Wilk-Shapiro Values


                                       Model
Statistic      Set    7YOS     8YOS     9YOS     10YOS    11YOS

R-squared      11     0.942    0.9628   0.9764   0.9663   0.9752
               12     0.8949   0.9685   0.9748   0.9625   0.9413
               13     0.9202   0.9761   0.978    0.969    0.9193

Adjusted       11     0.884    0.9257   0.9528   0.9266   0.9504
R-squared      12     0.8073   0.9423   0.9537   0.9313   0.8979
               13     0.8632   0.959    0.9622   0.9469   0.8617

Model "P"      11     0.0041   0.0014   0.0005   0.0013   0.0005
               12     0.0067   0.0002   0.0001   0.0003   0.0011
               13     0.001    0.0      0.0      0.0      0.0011

T              11     8        7        6        8        6
               12     8        7        8        9        8
               13     8        8        6        8        6

Approximate    11     0.9394   0.9842   0.9705   0.9644   0.9414
Wilk-Shapiro   12     0.8959   0.9849   0.9363   0.9614   0.938
               13     0.9349   0.9798   0.9418   0.9615   0.9682


                                       Model
Statistic      Set    12YOS"A" 12YOS"B" 12YOS"C" 13YOS    14YOS

R-squared      11     0.8604   0.8631   0.8692   0.9759   0.9839
               12     0.8701   0.8747   0.9102   0.93     0.9333
               13     0.8881   0.9014   0.9385   0.9081   0.9408

Adjusted       11     0.8448   0.8239   0.782    0.9519   0.9679
R-squared      12     0.8571   0.8468   0.8588   0.8717   0.8776
               13     0.878    0.8816   0.9078   0.8424   0.8985

Model "P"      11     0.0      0.0004   0.0081   0.0005   0.0002
               12     0.0      0.0001   0.0009   0.0021   0.0018
               13     0.0      0.0      0.0001   0.0016   0.0004

T              11     9        9        9        6        7
               12     9        9        7        9        7
               13     7        9        7        8        5

Approximate    11     0.9483   0.91     0.9672   0.9649   0.9607
Wilk-Shapiro   12     0.9775   0.9841   0.9423   0.9663   0.9304
               13     0.9644   0.9564   0.9458   0.9229   0.9356
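
As a worked check on how the adjusted values relate to the R-squared values, the sketch below applies the standard adjustment formula. It assumes that data set 13 contains 13 observations and that the number of IVs is the count listed for each model in Appendix C; neither assumption is stated in this appendix, and the function name is illustrative.

```python
def adjusted_r2(r2, n, p):
    """Standard adjusted R-squared for n observations and p IVs plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# 8YOS model, set 13: R-squared 0.9761, five IVs listed, n assumed to be 13.
adj_8yos = adjusted_r2(0.9761, n=13, p=5)

# 12YOS "A" model, set 13: R-squared 0.8881, one IV listed, n assumed to be 13.
adj_12a = adjusted_r2(0.8881, n=13, p=1)

print(f'8YOS      adjusted R-squared: {adj_8yos:.4f}')
print(f'12YOS "A" adjusted R-squared: {adj_12a:.4f}')
```

Under these assumptions the computed values agree with the tabulated 0.959 and 0.878 to within rounding, which suggests the adjusted figures follow the usual formula.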

Appendix  E:  Model  Forecasts 


      7YOS                              8YOS
year  rate      forecast  alternate   rate      forecast  alternate
77    0.848     0.8414    0.8408      0.9313    0.9258    0.9269
78    0.7505    0.7579    0.7531      0.8091    0.8142    0.8169
79    0.6374    0.6746    0.6772      0.6937    0.7382    0.7375
80    0.6675    0.6591    0.65        0.7992    0.7535    0.7591
81    0.7925    0.7678    0.7731      0.8418    0.8505    0.8497
82    0.8521    0.865     0.8708      0.9033    0.9141    0.9118
83    0.8745    0.8611    0.8611      0.9429    0.9352    0.9345
84    0.8024    0.8347    0.825       0.9018    0.9102    0.9152
85    0.7494    0.7215    0.7252      0.8143    0.8208    0.8173
86    0.7243    0.6557    0.6708      0.7918    0.7846    0.7754
87    0.6059    0.5589    0.5776      0.7237    0.7224    0.7079
88    0.4702    0.5927    0.6223      0.63      0.6015    0.5721
89    0.407     0.4359    0.4809      0.4741    0.4926    0.4448
90              0.431     0.4913                0.4436    0.3771
91              0.5539    0.5974                0.6628    0.6254
92              0.4685    0.5241                0.5136    0.4528

      9YOS                              10YOS
year  rate      forecast  alternate   rate      forecast  alternate
77    0.9513    0.9494    0.9491      0.9515    0.9522    0.9511
78    0.8718    0.8721    0.876       0.9343    0.9214    0.9209
79    0.7717    0.7939    0.7967      0.8126    0.8455    0.8455
80    0.8262    0.8404    0.8451      0.8976    0.8949    0.8928
81    0.89      0.8707    0.8704      0.9156    0.9078    0.9078
82    0.94      0.9413    0.9394      0.9591    0.9616    0.9626
83    0.9616    0.9617    0.9609      0.9782    0.9768    0.9772
84    0.9479    0.9528    0.9545      0.9632    0.9664    0.9643
85    0.8634    0.85      0.8489      0.8782    0.8982    0.9002
86    0.8262    0.7807    0.7735      0.8475    0.8235    0.8301
87    0.7875    0.7985    0.7872      0.8597    0.8215    0.8301
88    0.7423    0.761     0.7434      0.7756    0.7998    0.8147
89    0.6374    0.6673    0.6379      0.6944    0.7193    0.7453
90              0.5606    0.5154                0.6427    0.6831
91              0.7695    0.743                 0.7908    0.8122
92              0.6256    0.5856                0.7138    0.7472

      11YOS                             12YOS "A"
year  rate      forecast  alternate   rate      forecast  alternate
77    0.9684    0.9592    0.9567      0.9692    0.9784    0.9786
78    0.9487    0.9515    0.9558      0.9655    0.9591    0.9583
79    0.8733    0.9043    0.91        0.893     0.8662    0.8556
80    0.9244    0.939     0.9491      0.9187    0.9319    0.9291
81    0.9585    0.9506    0.9478      0.9781    0.969     0.9688
82    0.9681    0.9653    0.9628      0.9728    0.9781    0.9783
83    0.9747    0.9738    0.9732      0.9841    0.9839    0.9843
84    0.9632    0.9701    0.9715      0.9674    0.9736    0.9736
85    0.9344    0.9285    0.9248      0.9564    0.9435    0.9416
86    0.8762    0.8388    0.8366      0.9365    0.8701    0.8611
87    0.8559    0.8139    0.8032      0.852     0.8416    0.8292
88    0.8128    0.8293    0.8013      0.6702    0.7771    0.756
89    0.7911    0.8455    0.7903      0.5957    0.7428    0.7167
90              0.8214    0.7471                0.7904    0.7712
91              0.8408    0.7781                0.8196    0.8043
92              0.8769    0.809                 0.8712    0.8622

      12YOS "B"                         12YOS "C"
year  rate      forecast  alternate   rate      forecast  alternate
77    0.9692    0.9754    0.9758      0.9692    0.9794    0.9803
78    0.9655    0.9504    0.9502      0.9655    0.9651    0.967
79    0.893     0.8715    0.8624      0.893     0.8752    0.8783
80    0.9187    0.9253    0.9229      0.9187    0.9143    0.9125
81    0.9781    0.9681    0.9679      0.9781    0.9687    0.9683
82    0.9728    0.9817    0.9816      0.9728    0.9796    0.9793
83    0.9841    0.9832    0.9836      0.9841    0.9791    0.9781
84    0.9674    0.9766    0.9764      0.9674    0.9721    0.9712
85    0.9564    0.9475    0.9455      0.9564    0.9499    0.949
86    0.9365    0.8888    0.88        0.9365    0.9192    0.9205
87    0.852     0.8503    0.8384      0.852     0.844     0.8367
88    0.6702    0.7795    0.7595      0.6702    0.7907    0.7794
89    0.5957    0.7025    0.6774      0.5957    0.59      0.5352
90              0.7677    0.7493                0.7284    0.6976
91              0.8178    0.8033                0.7478    0.7151
92              0.8482    0.8402                0.7707    0.7306

      13YOS                             14YOS
year  rate      forecast  alternate   rate      forecast  alternate
77    0.9743    0.9759    0.9747      0.9867    0.9877    0.9886
78    0.9745    0.971     0.97        0.9758    0.9772    0.9764
79    0.9469    0.9472    0.9484      0.9737    0.9692    0.9693
80    0.9522    0.9581    0.9581      0.9749    0.9747    0.975
81    0.9808    0.9803    0.9809      0.9875    0.9881    0.9885
82    0.9895    0.9883    0.9883      0.9948    0.9938    0.9938
83    0.9617    0.9619    0.9616      0.9915    0.9921    0.992
84    0.9573    0.96      0.96        0.9929    0.9943    0.9946
85    0.955     0.9595    0.9606      0.9864    0.9826    0.9818
86    0.9864    0.9834    0.9837      0.9816    0.9784    0.9777
87    0.967     0.9615    0.9622      0.972     0.9651    0.9641
88    0.9413    0.9635    0.9657      0.9603    0.9691    0.9673
89    0.9014    0.874     0.887       0.9315    0.9454    0.9408
90              0.9242    0.9323                0.9468    0.9409
91              0.9285    0.9348                0.9631    0.9605
92              0.7999    0.8291                0.9534    0.9478
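
The forecast columns can be compared against the known rates with a simple error summary. The sketch below uses the 7YOS figures for FY 77 through FY 89 as transcribed in this appendix and computes the mean absolute error of each forecasting method. Mean absolute error is used here only as an illustration; it is not a statistic reported in this research, and the helper names are hypothetical.

```python
# 7YOS values for FY 77-89 (the years with known rates), from this appendix.
years = list(range(77, 90))
rate = [0.848, 0.7505, 0.6374, 0.6675, 0.7925, 0.8521, 0.8745,
        0.8024, 0.7494, 0.7243, 0.6059, 0.4702, 0.407]
forecast = [0.8414, 0.7579, 0.6746, 0.6591, 0.7678, 0.865, 0.8611,
            0.8347, 0.7215, 0.6557, 0.5589, 0.5927, 0.4359]
alternate = [0.8408, 0.7531, 0.6772, 0.65, 0.7731, 0.8708, 0.8611,
             0.825, 0.7252, 0.6708, 0.5776, 0.6223, 0.4809]


def abs_errors(actual, predicted):
    """Absolute difference between each actual rate and its prediction."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


def mean(values):
    return sum(values) / len(values)


err_forecast = abs_errors(rate, forecast)
err_alternate = abs_errors(rate, alternate)

# Year in which the primary forecast missed the actual rate by the most.
worst_year = years[err_forecast.index(max(err_forecast))]

print(f"mean absolute error, forecast:  {mean(err_forecast):.4f}")
print(f"mean absolute error, alternate: {mean(err_alternate):.4f}")
print(f"largest forecast miss: FY {worst_year}")
```

The same comparison could be repeated for the other models once actual FY 90 rates become available.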


Appendix  F:  Graphs  of  Residuals  vs.  Predicted  Values 


[Graphs not reproduced. Recoverable axis label: 7YOS Residual.]

Appendix G: Graphs of Residuals vs. Time


[Graphs not reproduced. Recoverable axis label: Year.]

Appendix I: Suggested Sources for Data


Published  Sources 

1. Bureau of the Census. Current Industrial Reports,
Series  MA-37D.  Washington:  Government  Printing 
Office,  July  1990. 

2.  Bureau  of  the  Census.  Current  Population  Reports, 
Series  P-25,  No.  1057.  Washington:  Government 
Printing  Office,  March,  1990. 

3.  Bureau  of  the  Census.  Statistical  Abstract  of  the 
United  States:  1989.  Washington:  Government  Printing 
Office,  1989. 

4.  Bureau  of  Economic  Analysis.  Survey  of  Current 
Business.  Washington:  Government  Printing  Office, 
March,  1990. 

5.  Federal  Aviation  Administration.  FAA  Aviation 
Forecasts  —  Fiscal  Years  1989-2000.  Washington: 
Government  Printing  Office,  March,  1989. 


Other  Sources 

1. Air Transport Association of America
1709 New York Avenue, NW
Washington, DC 20006-5206

2. Aerospace Industries Association of America
1250 Eye Street, NW
Washington, DC 20036

3. Aviation Resources, Inc.
201 Smokerise Trace
Peachtree City, GA 30269

4. Aviation Week and Space Technology
Suite 1200
1120 Vermont Avenue
Washington, DC 20005

5. Boeing Company
7755 E. Marginal Way, South
Seattle, WA 98108

6. Future Aviation Professionals of America
4959 Massachusetts Boulevard
Atlanta, GA 30337-6607

7. McDonnell-Douglas Corporation
P.O. Box 516
St. Louis, MO 63166